
Teaching AI to Feel: A Collaborative, Full-Body Exploration of Emotive Communication

Tütüncü, Esen K., Lemus, Lissette, Pilcher, Kris, Sprengel, Holger, Sabater-Mir, Jordi

arXiv.org Artificial Intelligence

Commonaiverse is an interactive installation exploring human emotions through full-body motion tracking and real-time AI feedback. Participants engage in three phases: Teaching, Exploration and the Cosmos Phase, collaboratively expressing and interpreting emotions with the system. The installation integrates MoveNet for precise motion tracking and a multi-recommender AI system to analyze emotional states dynamically, responding with adaptive audiovisual outputs. By shifting from top-down emotion classification to participant-driven, culturally diverse definitions, we highlight new pathways for inclusive, ethical affective computing. We discuss how this collaborative, out-of-the-box approach pushes multimedia research beyond single-user facial analysis toward a more embodied, co-created paradigm of emotional AI. Furthermore, we reflect on how this reimagined framework fosters user agency, reduces bias, and opens avenues for advanced interactive applications.


One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework

Li, Feiran, Xu, Qianqian, Bao, Shilong, Yang, Zhiyong, Cao, Xiaochun, Huang, Qingming

arXiv.org Artificial Intelligence

Concept erasing has recently emerged as an effective paradigm to prevent text-to-image diffusion models from generating visually undesirable or even harmful content. However, current removal methods rely heavily on manually crafted text prompts, making it challenging to erase the target concept thoroughly (efficacy) while minimizing the impact on other benign concepts (usability). In this paper, we attribute these limitations to the inherent gap between the text and image modalities, which makes it hard to transfer the intricately entangled concept knowledge from text prompts to the image generation process. To address this, we propose a novel solution by directly integrating visual supervision into the erasure process, introducing the first text-image Collaborative Concept Erasing (Co-Erasing) framework. Specifically, Co-Erasing describes the concept jointly by text prompts and the corresponding undesirable images induced by the prompts, and then reduces the generating probability of the target concept through negative guidance. This approach effectively bypasses the knowledge gap between text and image, significantly enhancing erasure efficacy. Additionally, we design a text-guided image concept refinement strategy that directs the model to focus on visual features most relevant to the specified text concept, minimizing disruption to other benign concepts. Finally, comprehensive experiments suggest that Co-Erasing significantly outperforms state-of-the-art erasure approaches, with a better trade-off between efficacy and usability. Code is available at https://github.com/Ferry-Li/Co-Erasing.
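The abstract does not spell out how negative guidance is computed. As a rough illustration only, the sketch below uses a generic form of negative guidance familiar from classifier-free guidance: the training target moves away from the concept-conditioned noise prediction. The function name, the `eta` strength parameter, and the toy arrays are assumptions for illustration, not the Co-Erasing implementation.

```python
import numpy as np

def negative_guidance_target(eps_uncond, eps_concept, eta=1.0):
    """Generic negative-guidance target (illustrative, not the paper's code).

    eps_uncond:  noise predicted without conditioning
    eps_concept: noise predicted when conditioned on the concept to erase
    eta:         hypothetical guidance strength
    """
    # Step away from the direction that the concept adds to the prediction.
    return eps_uncond - eta * (eps_concept - eps_uncond)

# Toy noise predictions standing in for a diffusion model's outputs.
eps_uncond = np.zeros(4)
eps_concept = np.ones(4)
target = negative_guidance_target(eps_uncond, eps_concept, eta=1.0)
```

In this toy case the concept pushes every component up by one, so the target pushes every component down by the same amount.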


Comparing Native and Non-native English Speakers' Behaviors in Collaborative Writing through Visual Analytics

Chen, Yuexi, Xiao, Yimin, Zinat, Kazi Tasnim, Yamashita, Naomi, Gao, Ge, Liu, Zhicheng

arXiv.org Artificial Intelligence

Understanding collaborative writing dynamics between native speakers (NS) and non-native speakers (NNS) is critical for enhancing collaboration quality and team inclusivity. In this paper, we partnered with communication researchers to develop visual analytics solutions for comparing NS and NNS behaviors in 162 writing sessions across 27 teams. The primary challenges in analyzing writing behaviors are data complexity and the uncertainties introduced by automated methods. In response, we present COALA, a novel visual analytics tool that improves model interpretability by displaying uncertainties in author clusters, generating behavior summaries using large language models, and visualizing writing-related actions at multiple granularities. We validated the effectiveness of COALA through user studies with domain experts (N=2+2) and researchers with relevant experience (N=8). We present the insights discovered by participants using COALA, suggest features for future AI-assisted collaborative writing tools, and discuss the broader implications for analyzing collaborative processes beyond writing.


Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights

Kahana, Jonathan, Nathan, Or, Horwitz, Eliahu, Hoshen, Yedid

arXiv.org Artificial Intelligence

With the growing number of publicly available models, there are probably pretrained, online models for most tasks users require. However, current model search methods are rudimentary, essentially a text-based search of the documentation, so users often cannot find the relevant models. This paper presents ProbeLog, a method for retrieving classification models that can recognize a target concept, such as "Dog", without access to model metadata or training data. Unlike previous probing methods, ProbeLog computes a descriptor for each output dimension (logit) of each model by observing its responses on a fixed set of inputs (probes). Our method supports both logit-based retrieval ("find more logits like this") and zero-shot, text-based retrieval ("find all logits corresponding to dogs"). As probing-based representations require multiple costly feedforward passes through the model, we develop a method, based on collaborative filtering, that reduces the cost of encoding repositories by 3x. We demonstrate that ProbeLog achieves high retrieval accuracy in both real-world and fine-grained search tasks and is scalable to full-size repositories.
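The idea of a per-logit descriptor built from responses to fixed probes can be sketched as follows. This is a minimal illustration under stated assumptions, not the ProbeLog implementation: the models are toy linear maps, the probes are random vectors, and the mean-centering plus cosine similarity used for matching is an assumed choice. All function names are hypothetical.

```python
import numpy as np

def logit_descriptors(model_fn, probes):
    """One descriptor per output logit: that logit's responses across all probes."""
    outputs = np.stack([model_fn(p) for p in probes])  # (n_probes, n_logits)
    return outputs.T                                   # (n_logits, n_probes)

def normalize(desc):
    """Mean-center and L2-normalize each descriptor (assumed preprocessing)."""
    desc = desc - desc.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(desc, axis=1, keepdims=True)
    return desc / np.clip(norms, 1e-12, None)

def retrieve(query, gallery, top_k=3):
    """Rank gallery logits by cosine similarity to a query logit descriptor."""
    q = normalize(query[None, :])[0]
    scores = normalize(gallery) @ q
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

rng = np.random.default_rng(0)
probes = rng.normal(size=(64, 16))   # fixed probe set shared by every model

# Two toy "classifiers"; logit 2 of model B detects the same concept as logit 1 of model A.
W_a = rng.normal(size=(5, 16))
W_b = rng.normal(size=(4, 16))
W_b[2] = W_a[1]

desc_a = logit_descriptors(lambda x: W_a @ x, probes)
desc_b = logit_descriptors(lambda x: W_b @ x, probes)

# "Find more logits like this": query with logit 1 of model A, search model B.
best, best_scores = retrieve(desc_a[1], desc_b, top_k=1)
```

Because every model is probed with the same inputs, descriptors from different models live in the same space and can be compared directly, which is what makes logit-level search possible without metadata.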


Interactive Sketchpad: An Interactive Multimodal System for Collaborative, Visual Problem-Solving

Chen, Steven-Shine, Lee, Jimin, Liang, Paul Pu

arXiv.org Artificial Intelligence

Humans have long relied on visual aids like sketches and diagrams to support reasoning and problem-solving. Visual tools, like auxiliary lines in geometry or graphs in calculus, are essential for understanding complex ideas. However, many tutoring systems remain text-based, providing feedback only through natural language. Leveraging recent advances in Large Multimodal Models (LMMs), this paper introduces Interactive Sketchpad, a tutoring system that combines language-based explanations with interactive visualizations to enhance learning. Built on a pre-trained LMM, Interactive Sketchpad is fine-tuned to provide step-by-step guidance in both text and visuals, enabling natural multimodal interaction with the student. Accurate and robust diagrams are generated by incorporating code execution into the reasoning process. User studies conducted on math problems such as geometry, calculus, and trigonometry demonstrate that Interactive Sketchpad leads to improved task comprehension, problem-solving accuracy, and engagement levels, highlighting its potential for transforming educational technologies.


Collaborative filtering based on nonnegative/binary matrix factorization

Terui, Yukino, Inoue, Yuka, Hamakawa, Yohei, Tatsumura, Kosuke, Kudo, Kazue

arXiv.org Artificial Intelligence

Collaborative filtering generates recommendations based on user-item similarities through rating data, which may involve numerous unrated items. Matrix factorization techniques, such as nonnegative matrix factorization (NMF), are often employed to predict scores for unrated items. Nonnegative/binary matrix factorization (NBMF), which is an extension of NMF, approximates a nonnegative matrix as the product of nonnegative and binary matrices. Previous studies have employed NBMF for image analysis where the data were dense. In this paper, we propose a modified NBMF algorithm that can be applied to collaborative filtering where data are sparse. In the modified method, unrated elements in a rating matrix are masked, which improves the collaborative filtering performance. Utilizing a low-latency Ising machine in NBMF is advantageous in terms of the computation time, making the proposed method beneficial.
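To show what masking unrated elements means in practice, here is a minimal sketch using plain NMF with multiplicative updates restricted to observed entries. It is a swapped-in simplification: both factors are real-valued and nonnegative, whereas NBMF constrains one factor to be binary and solves for it with an Ising machine. The rating matrix, rank, iteration count, and function names are assumptions for illustration.

```python
import numpy as np

def masked_nmf(R, mask, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Factor R ~ W @ H, fitting only the entries where mask == 1."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    W = rng.random((n_users, rank)) + 0.1
    H = rng.random((rank, n_items)) + 0.1
    MR = mask * R
    for _ in range(n_iter):
        # Multiplicative updates; unrated entries contribute nothing to either ratio.
        H *= (W.T @ MR) / (W.T @ (mask * (W @ H)) + eps)
        W *= (MR @ H.T) / ((mask * (W @ H)) @ H.T + eps)
    return W, H

def masked_rmse(R, pred, mask):
    """Root-mean-square error over observed entries only."""
    return np.sqrt(((mask * (R - pred)) ** 2).sum() / mask.sum())

# Toy rating matrix; 0 marks an unrated item (an assumed encoding).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
mask = (R > 0).astype(float)

W, H = masked_nmf(R, mask)
pred = W @ H   # entries where mask == 0 are the predicted scores for unrated items
```

Without the mask, the zeros would be treated as genuine low ratings and pull predictions for unrated items toward zero, which is the failure mode the masking is meant to avoid on sparse data.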


Wireless Resource Allocation with Collaborative Distributed and Centralized DRL under Control Channel Attacks

Wang, Ke, Liu, Wanchun, Lim, Teng Joon

arXiv.org Artificial Intelligence

In this paper, we consider a wireless resource allocation problem in a cyber-physical system (CPS) where the control channel, carrying resource allocation commands, is subjected to denial-of-service (DoS) attacks. We propose a novel concept of collaborative distributed and centralized (CDC) resource allocation to effectively mitigate the impact of these attacks. To optimize the CDC resource allocation policy, we develop a new CDC-deep reinforcement learning (DRL) algorithm, whereas existing DRL frameworks only formulate either centralized or distributed decision-making problems. Simulation results demonstrate that the CDC-DRL algorithm significantly outperforms state-of-the-art DRL benchmarks, showcasing its ability to address resource allocation problems in large-scale CPSs under control channel attacks.


How Today's Recommender Systems Use Machine Learning to Cater to Your Every Whim

Communications of the ACM

Whether they recommend products, offers, or content, all recommender systems ultimately determine what makes you more or less compatible with an item or piece of content, according to Julian McAuley, a professor of computer science at University of California San Diego. "More elaborate models leverage machine learning and capture temporal dynamics and changing user context," said McAuley. "But the core idea is the same: they use historical interactions to learn which users and items are similar to each other." They use different approaches to accomplish that. Some recommender systems are content-based systems, examining the properties of different items or pieces of content, explained Dinesh Gauri, Walmart Chair of Marketing at the University of Arkansas.
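The core idea McAuley describes, learning which items are similar from historical interactions, can be sketched with item-to-item cosine similarity. This is a minimal illustration of one classic approach, not a description of any production system; the interaction matrix and function names are made up for the example.

```python
import numpy as np

def item_similarity(interactions):
    """Cosine similarity between item columns of a user-by-item matrix."""
    norms = np.linalg.norm(interactions, axis=0, keepdims=True)
    unit = interactions / np.clip(norms, 1e-12, None)
    return unit.T @ unit

def recommend(interactions, user, top_k=2):
    """Score unseen items by their similarity to the items the user interacted with."""
    sim = item_similarity(interactions)
    scores = interactions[user] @ sim
    # Never recommend something the user has already interacted with.
    scores = np.where(interactions[user] > 0, -np.inf, scores)
    return np.argsort(-scores)[:top_k]

# Rows are users, columns are items; 1 means the user interacted with the item.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
], dtype=float)

recs = recommend(interactions, user=3, top_k=1)
```

User 3 has only touched item 0, and item 1 is the item most often seen alongside item 0 in other users' histories, so it is ranked first.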


A Neural Matrix Decomposition Recommender System Model based on the Multimodal Large Language Model

Xiang, Ao, Huang, Bingjie, Guo, Xinyu, Yang, Haowei, Zheng, Tianyao

arXiv.org Artificial Intelligence

The challenge of finding content that aligns with users' interests within this abundance has become increasingly important. Recommender systems play a crucial role in addressing this issue, as they have the potential to provide precise recommendations that enhance user experience and save time in commercial applications [1]. These systems predict user ratings for specific items by employing data mining techniques and related predictive algorithms to make highly relevant predictions. By analyzing user historical behavior, preferences, and item characteristics, recommender systems effectively solve the information filtering problem by automatically matching items that may be of interest to users. Traditional recommender systems primarily consist of collaborative filtering [2], content-based recommendations [3], and hybrid recommendation methods, among which collaborative filtering is one of the earliest and most widely used techniques for recommending products or items based on past purchasing history.


A Collaborative, Human-Centred Taxonomy of AI, Algorithmic, and Automation Harms

Abercrombie, Gavin, Benbouzid, Djalel, Giudici, Paolo, Golpayegani, Delaram, Hernandez, Julio, Noro, Pierre, Pandit, Harshvardhan, Paraschou, Eva, Pownall, Charlie, Prajapati, Jyoti, Sayre, Mark A., Sengupta, Ushnish, Suriyawongkul, Arthit, Thelot, Ruby, Vei, Sofia, Waltersdorfer, Laura

arXiv.org Artificial Intelligence

This paper introduces a collaborative, human-centered taxonomy of AI, algorithmic and automation harms. We argue that existing taxonomies, while valuable, can be narrow, unclear, typically cater to practitioners and government, and often overlook the needs of the wider public. Drawing on existing taxonomies and a large repository of documented incidents, we propose a taxonomy that is clear and understandable to a broad set of audiences, as well as being flexible, extensible, and interoperable. Through iterative refinement with topic experts and crowdsourced annotation testing, we propose a taxonomy that can serve as a powerful tool for civil society organisations, educators, policymakers, product teams and the general public. By fostering a greater understanding of the real-world harms of AI and related technologies, we aim to increase understanding, empower NGOs and individuals to identify and report violations, inform policy discussions, and encourage responsible technology development and deployment.