 human communication


Emergent Communication in Interactive Sketch Question Answering

Neural Information Processing Systems

Vision-based emergent communication (EC) aims to learn to communicate through sketches and demystify the evolution of human communication. Ironically, previous works neglect multi-round interaction, which is indispensable in human communication. To fill this gap, we first introduce a novel Interactive Sketch Question Answering (ISQA) task, where two collaborative players interact through sketches to answer a question about an image. To accomplish this task, we design a new and efficient interactive EC system, which can achieve an effective balance among three evaluation factors: question answering accuracy, drawing complexity, and human interpretability. Our experimental results demonstrate that a multi-round interactive mechanism facilitates targeted and efficient communication between intelligent agents. The code will be released.


Conversational DNA: A New Visual Language for Understanding Dialogue Structure in Human and AI

arXiv.org Artificial Intelligence

What if the patterns hidden within dialogue reveal more about communication than the words themselves? We introduce Conversational DNA, a novel visual language that treats any dialogue -- whether between humans, between human and AI, or among groups -- as a living system with interpretable structure that can be visualized, compared, and understood. Unlike traditional conversation analysis that reduces rich interaction to statistical summaries, our approach reveals the temporal architecture of dialogue through biological metaphors. Linguistic complexity flows through strand thickness, emotional trajectories cascade through color gradients, conversational relevance forms through connecting elements, and topic coherence maintains structural integrity through helical patterns. Through exploratory analysis of therapeutic conversations and historically significant human-AI dialogues, we demonstrate how this visualization approach reveals interaction patterns that traditional methods miss. Our work contributes a new creative framework for understanding communication that bridges data visualization, human-computer interaction, and the fundamental question of what makes dialogue meaningful in an age where humans increasingly converse with artificial minds.


Beyond Prompts: Learning from Human Communication for Enhanced AI Intent Alignment

arXiv.org Artificial Intelligence

AI intent alignment, ensuring that AI produces outcomes as intended by users, is a critical challenge in human-AI interaction. The emergence of generative AI, including LLMs, has intensified the significance of this problem, as interactions increasingly involve users specifying desired results for AI systems. In order to support better AI intent alignment, we aim to explore human strategies for intent specification in human-human communication. By studying and comparing human-human and human-LLM communication, we identify key strategies that can be applied to the design of AI systems that are more effective at understanding and aligning with user intent. This study aims to advance toward a human-centered AI system by bringing together human communication strategies for the design of AI systems.


Can AI and humans genuinely communicate?

arXiv.org Artificial Intelligence

Can AI and humans genuinely communicate? In this article, after giving some background and motivating my proposal (sections 1 to 3), I explore a way to answer this question that I call the "mental-behavioral methodology" (sections 4 and 5). This methodology has three steps. First, spell out what mental capacities are sufficient for human communication (as opposed to communication more generally). Second, spell out the experimental paradigms required to test whether a behavior exhibits these capacities. Third, apply or adapt these paradigms to test whether an AI displays the relevant behaviors. If the first two steps are successfully completed, and if the AI passes the tests with human-like results, this constitutes evidence that this AI and humans can genuinely communicate. This mental-behavioral methodology has the advantage that we don't need to understand the workings of black-box algorithms, such as standard deep neural networks. This is comparable to the fact that we don't need to understand how human brains work to know that humans can genuinely communicate. This methodology also has its disadvantages, and I will discuss some of them (section 6).


SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

arXiv.org Artificial Intelligence

Human communication is a complex and diverse process that not only involves multiple factors such as language, commonsense, and cultural backgrounds but also requires the participation of multimodal information, such as speech. Large Language Model (LLM)-based multi-agent systems have demonstrated promising performance in simulating human society. Can we leverage LLM-based multi-agent systems to simulate human communication? Current LLM-based multi-agent systems, however, mainly rely on text as the primary medium. In this paper, we propose SpeechAgents, a multi-modal LLM-based multi-agent system designed for simulating human communication. SpeechAgents utilizes a multi-modal LLM as the control center for each individual agent and employs multi-modal signals as the medium for messages exchanged among agents. Additionally, we propose Multi-Agent Tuning to enhance the multi-agent capabilities of the LLM without compromising general abilities. To strengthen and evaluate the effectiveness of human communication simulation, we build the Human-Communication Simulation Benchmark. Experimental results demonstrate that SpeechAgents can simulate human communication dialogues with consistent content, authentic rhythm, and rich emotions, and that it scales well even with up to 25 agents, so it can be applied to tasks such as drama creation and audio novel generation. Code and models will be open-sourced at https://github.com/0nutation/SpeechAgents


The subtle art of language: why artificial general intelligence might be impossible

#artificialintelligence

Consciousness is arguably the most mysterious problem humans have ever encountered. In many famous philosophical essays, consciousness is regarded as unsolvable. Yet, as we speak, engineers and cognitive scientists are putting their noses to the grindstone to develop consciousness in artificial intelligence (AI) systems. Typically, this project is referred to as the development of "artificial general intelligence" (AGI), which covers a wide range of cognitive and intellectual abilities that humans possess. Thus far, this project -- being conducted globally in 72 independent research projects -- has not produced conscious robots.


Trends of Cognitive Computing Organizations Need to Know Before 2022

#artificialintelligence

Cognitive computing draws on cognitive science and rests on the premise of simulating the human thought process. It combines disruptive technologies such as AI and machine learning with sentiment analysis and contextual awareness to solve everyday problems the way humans do. It is used in industries such as healthcare and insurance. The goal of cognitive computing is to simulate human thought processes in a computerized model. By implementing self-learning algorithms that use data mining, pattern recognition and natural language processing, machines can mimic the way human brains function.


Like a Shem that brought Golem to life

#artificialintelligence

A chatbot is a great way to make internet communication more pleasant for both customers and companies. At the beginning of the millennium, people would probably have laughed at you for this sentence. The main reason for this change is Natural Language Processing (NLP). It is this branch of artificial intelligence that has transformed clumsy and cumbersome automata into today's clever chatbots, which you can sometimes hardly tell from people. Thanks to NLP, artificial intelligence learns to understand something as complex as human communication.


Machines That Can Understand Human Speech: The Conversational Pattern Of AI

#artificialintelligence

Early on in the evolution of artificial intelligence, researchers realized the power and possibility of machines that are able to understand the meaning and nuances of human speech. Conversation and human language are a particularly challenging area for computers, since words and communication are not precise. Human language is filled with nuance, context, cultural and societal depth, and imprecision that can lead to a wide range of interpretations. If computers can understand what we mean when we talk, and then communicate back to us in a way we can understand, then clearly we've accomplished a goal of artificial intelligence. This particular application of AI is so profound that it makes up one of the seven fundamental patterns of AI: the conversation and human interaction pattern.