AITopics | agreement score

agreement score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

What Knowledge Gets Distilled in Knowledge Distillation?

Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee (University of Wisconsin-Madison)

Neural Information Processing Systems, Apr-25-2026, 22:47:11 GMT

Knowledge distillation aims to transfer useful information from a teacher network to a student network, with the primary goal of improving the student's performance for the task at hand. Over the years, there has been a deluge of novel techniques and use cases of knowledge distillation. Yet, despite the various improvements, there seems to be a glaring gap in the community's fundamental understanding of the process. Specifically, what is the knowledge that gets distilled in knowledge distillation? In other words, in what ways does the student become similar to the teacher?

artificial intelligence, machine learning, student, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.40)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

608fe7e32f7b773545cc1d656a0fdc98-Paper-Conference.pdf

Neural Information Processing Systems, Feb-15-2026, 03:25:12 GMT

draft model, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)

DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning

Sivakumaran, Nithin, Chen, Justin Chih-Yao, Wan, David, Zhang, Yue, Yoon, Jaehong, Stengel-Eskin, Elias, Bansal, Mohit

arXiv.org Artificial Intelligence, Dec-9-2025

Specialized visual tools can augment large language models or vision language models with expert knowledge (e.g., grounding, spatial reasoning, medical knowledge, etc.), but knowing which tools to call (and when to call them) can be challenging. We introduce DART, a multi-agent framework that uses disagreements between multiple debating visual agents to identify useful visual tools (e.g., object detection, OCR, spatial reasoning, etc.) that can resolve inter-agent disagreement. These tools allow for fruitful multi-agent discussion by introducing new information, and by providing tool-aligned agreement scores that highlight agents in agreement with expert tools, thereby facilitating discussion. We utilize an aggregator agent to select the best answer by providing the agent outputs and tool information. We test DART on four diverse benchmarks and show that our approach improves over multi-agent debate as well as over single agent tool-calling frameworks, beating the next-strongest baseline (multi-agent debate with a judge model) by 3.4% and 2.4% on A-OKVQA and MMMU respectively. We also find that DART adapts well to new tools in applied domains, with a 1.3% improvement on the M3D medical dataset over other strong tool-calling, single agent, and multi-agent baselines. Additionally, we measure text overlap across rounds to highlight the rich discussion in DART compared to existing multi-agent methods. Finally, we study the tool call distribution, finding that diverse tools are reliably used to help resolve disagreement.

agent, artificial intelligence, reasoning, (18 more...)

arXiv.org Artificial Intelligence

2512.07132

Country:

North America > United States (0.46)
Asia (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Don't Reach for the Stars: Rethinking Topology for Resilient Federated Learning

Konstantin, Mirko, Mukhopadhyay, Anirban

arXiv.org Artificial Intelligence, Nov-25-2025

Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy by keeping data local. Traditional FL approaches rely on a centralized, star-shaped topology, where a central server aggregates model updates from clients. However, this architecture introduces several limitations, including a single point of failure, limited personalization, and poor robustness to distribution shifts and to malfunctioning clients. Moreover, update selection in centralized FL often relies on low-level parameter differences, which can be unreliable when client data is not independent and identically distributed, and offers clients little control. In this work, we propose a decentralized, peer-to-peer (P2P) FL framework. It leverages the flexibility of the P2P topology to enable each client to identify and aggregate a personalized set of trustworthy and beneficial updates. This framework is the Local Inference Guided Aggregation for Heterogeneous Training Environments to Yield Enhancement Through Agreement and Regularization (LIGHTYEAR). Central to our method is an agreement score, computed on a local validation set, which quantifies the semantic alignment of incoming updates in the function space with respect to the client's reference model. Each client uses this score to select a tailored subset of updates and performs aggregation with a regularization term that further stabilizes the training. Our empirical evaluation across five datasets shows that the proposed approach consistently outperforms both centralized baselines and existing P2P methods in terms of client-level performance, particularly under adversarial and heterogeneous conditions.
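The abstract does not define the agreement score precisely, so the following is only a minimal sketch of one plausible reading: agreement is taken to be the fraction of local validation inputs on which an incoming model predicts the same label as the client's reference model. The names `agreement_score`, `select_updates`, and the `threshold` value are hypothetical and not taken from the paper, and the regularized aggregation step is not shown.

```python
from typing import Callable, Dict, Sequence

# Hypothetical model type: maps one scalar input to a class label.
Model = Callable[[float], int]


def agreement_score(reference: Model, candidate: Model,
                    val_inputs: Sequence[float]) -> float:
    """Fraction of local validation inputs where both models predict the same label.

    Assumption: label-level agreement stands in for "alignment in function space".
    """
    if not val_inputs:
        raise ValueError("validation set must be non-empty")
    matches = sum(1 for x in val_inputs if reference(x) == candidate(x))
    return matches / len(val_inputs)


def select_updates(reference: Model, candidates: Dict[str, Model],
                   val_inputs: Sequence[float],
                   threshold: float = 0.8) -> Dict[str, float]:
    """Keep peers whose agreement meets the threshold (an assumed selection rule)."""
    scores = {name: agreement_score(reference, model, val_inputs)
              for name, model in candidates.items()}
    return {name: s for name, s in scores.items() if s >= threshold}


# Toy example: one peer close to the reference, one with inverted predictions.
reference = lambda x: int(x > 0.5)
peers = {
    "peer_a": lambda x: int(x > 0.55),  # nearly the same decision boundary
    "peer_b": lambda x: int(x <= 0.5),  # inverted labels, e.g. a malfunctioning client
}
val_inputs = [0.1, 0.3, 0.52, 0.7, 0.9]
selected = select_updates(reference, peers, val_inputs)
print(selected)
```

In this toy setup `peer_a` disagrees with the reference on a single validation point and is kept, while `peer_b` disagrees everywhere and is dropped. Comparing predictions rather than raw parameters is what makes such a score usable when peers have different parameterizations.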

artificial intelligence, machine learning, malfunction, (17 more...)

arXiv.org Artificial Intelligence

2508.05224

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

ProMediate: A Socio-cognitive framework for evaluating proactive agents in multi-party negotiation

Liu, Ziyi, Sarrafzadeh, Bahar, Zhou, Pei, Yang, Longqi, Zhao, Jieyu, Sharma, Ashish

arXiv.org Artificial Intelligence, Oct-30-2025

While Large Language Models (LLMs) are increasingly used in agentic frameworks to assist individual users, there is a growing need for agents that can proactively manage complex, multi-party collaboration. Systematic evaluation methods for such proactive agents remain scarce, limiting progress in developing AI that can effectively support multiple people together. Negotiation offers a demanding testbed for this challenge, requiring socio-cognitive intelligence to navigate conflicting interests between multiple participants and multiple topics and build consensus. Here, we present ProMediate, the first framework for evaluating proactive AI mediator agents in complex, multi-topic, multi-party negotiations. ProMediate consists of two core components: (i) a simulation testbed based on realistic negotiation cases and theory-driven difficulty levels (ProMediate-Easy, ProMediate-Medium, and ProMediate-Hard), with a plug-and-play proactive AI mediator grounded in socio-cognitive mediation theories, capable of flexibly deciding when and how to intervene; and (ii) a socio-cognitive evaluation framework with a new suite of metrics to measure consensus changes, intervention latency, mediator effectiveness, and intelligence. Together, these components establish a systematic framework for assessing the socio-cognitive intelligence of proactive AI agents in multi-party settings. Our results show that a socially intelligent mediator agent outperforms a generic baseline, via faster, better-targeted interventions. In the ProMediate-Hard setting, our social mediator increases consensus change by 3.6 percentage points compared to the generic baseline (10.65% vs. 7.01%) while being 77% faster in response (15.98s vs. 3.71s). In conclusion, ProMediate provides a rigorous, theory-grounded testbed to advance the development of proactive, socially intelligent agents.
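The headline figures in this abstract can be checked with a few lines of arithmetic. This is only a sanity check on the quoted numbers. The abstract does not say which latency belongs to which agent, so the sketch assumes the smaller value (3.71s) is the social mediator's, and the variable names are illustrative.

```python
# Consensus change in the ProMediate-Hard setting (percent), as quoted in the abstract.
social_consensus = 10.65
generic_consensus = 7.01
gain_pp = round(social_consensus - generic_consensus, 2)

# Response latencies in seconds, as quoted. Assumption: the smaller value
# is the social mediator and the larger is the generic baseline.
slower_latency = 15.98
faster_latency = 3.71
latency_reduction_pct = round((slower_latency - faster_latency) / slower_latency * 100, 1)

print(f"consensus gain: {gain_pp} percentage points")
print(f"latency reduction: {latency_reduction_pct}%")
```

Under that assumption the gain is 3.64 percentage points and the latency reduction is 76.8%, which round to the 3.6 points and 77% stated in the abstract.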

artificial intelligence, mediator, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.25224

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion

Qiu, Haoyi, Zhou, Yilun, Venkit, Pranav Narayanan, Huang, Kung-Hsiang, Zhang, Jiaxin, Peng, Nanyun, Wu, Chien-Sheng

arXiv.org Artificial Intelligence, Oct-28-2025

As Large Vision-Language Models (LVLMs) are increasingly deployed in domains such as shopping, health, and news, they are exposed to pervasive persuasive content. A critical question is how these models function as persuadees: how and why they can be influenced by persuasive multimodal inputs. Understanding both their susceptibility to persuasion and the effectiveness of different persuasive strategies is crucial, as overly persuadable models may adopt misleading beliefs, override user preferences, or generate unethical or unsafe outputs when exposed to manipulative messages. We introduce MMPersuade, a unified framework for systematically studying multimodal persuasion dynamics in LVLMs. MMPersuade contributes (i) a comprehensive multimodal dataset that pairs images and videos with established persuasion principles across commercial, subjective and behavioral, and adversarial contexts, and (ii) an evaluation framework that quantifies both persuasion effectiveness and model susceptibility via third-party agreement scoring and self-estimated token probabilities on conversation histories. Our study of six leading LVLMs as persuadees yields three key insights: (i) multimodal inputs substantially increase persuasion effectiveness (and model susceptibility) compared to text alone, especially in misinformation scenarios; (ii) stated prior preferences decrease susceptibility, yet multimodal information maintains its persuasive advantage; and (iii) different strategies vary in effectiveness across contexts, with reciprocity being most potent in commercial and subjective contexts, and credibility and logic prevailing in adversarial contexts. By jointly analyzing persuasion effectiveness and susceptibility, MMPersuade provides a principled foundation for developing models that are robust, preference-consistent, and ethically aligned when engaging with persuasive multimodal content.

large language model, machine learning, persuasion, (22 more...)

arXiv.org Artificial Intelligence

2510.22768

Country: North America > United States > California (0.46)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.87)

Industry:

Information Technology (1.00)
Health & Medicine > Consumer Health (1.00)
Government (1.00)
(5 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Scalable multilingual PII annotation for responsible AI in LLMs

Meena, Bharti, Skubisz, Joanna, Rajgarhia, Harshit, Dave, Nand, Ganesh, Kiran, Dalmia, Shivali, Mukherji, Abhishek, Sundarababu, Vasudevan

arXiv.org Artificial Intelligence, Oct-13-2025

Abstract: As Large Language Models (LLMs) gain wider adoption, ensuring their reliable handling of Personally Identifiable Information (PII) across diverse regulatory contexts has become essential. This work introduces a scalable multilingual data curation framework designed for high-quality PII annotation across 13 underrepresented locales (Table I), covering approximately 336 locale-specific PII types. Our phased, human-in-the-loop annotation methodology combines linguistic expertise with rigorous quality assurance, leading to substantial improvements in recall and false positive rates across the pilot, training, and production phases. Beyond reporting empirical gains, we highlight common annotator challenges in multilingual PII labeling and demonstrate how iterative, analytics-driven pipelines can enhance both annotation quality and downstream model reliability.

I. Introduction

A. PII Data Protection

The surge in user-generated content has led to vast textual corpora containing hidden instances of Personally Identifiable Information (PII) in application forms, support tickets, reviews and social media posts [1]. PII, such as NAME, SSN, and PHONE NUMBER, poses significant privacy risks if not handled correctly. Its compromise can result in identity theft, financial fraud, and unauthorized access to sensitive data [2].

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.0625

Country:

Europe (1.00)
Asia (0.68)
North America > United States (0.46)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.90)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling

Neural Information Processing Systems, Oct-10-2025, 04:12:07 GMT

Touvron et al., 2023; Jiang et al., 2023) has become a central theme of research.

algorithm, draft model, probe, (14 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)

Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation

Mooney, James, Woldense, Josef, Jia, Zheng Robert, Hayati, Shirley Anugrah, Nguyen, My Ha, Raheja, Vipul, Kang, Dongyeop

arXiv.org Artificial Intelligence, Sep-9-2025

The impressive capabilities of Large Language Models (LLMs) have fueled the notion that synthetic agents can serve as substitutes for real participants in human-subject research. In an effort to evaluate the merits of this claim, social science researchers have largely focused on whether LLM-generated survey data corresponds to that of a human counterpart whom the LLM is prompted to represent. In contrast, we address a more fundamental question: Do agents maintain internal consistency, retaining similar behaviors when examined under different experimental settings? To this end, we develop a study designed to (a) reveal the agent's internal state and (b) examine agent behavior in a basic dialogue setting. This design enables us to explore a set of behavioral hypotheses to assess whether an agent's conversation behavior is consistent with what we would expect from their revealed internal state. Our findings on these hypotheses show significant internal inconsistencies in LLMs across model families and at differing model sizes. Most importantly, we find that, although agents may generate responses matching those of their human counterparts, they fail to be internally consistent, representing a critical gap in their capabilities to accurately substitute for real participants in human-subject research. Our simulation code and data are publicly accessible.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.03736

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Law (0.93)
Government > Tax (0.93)
Health & Medicine (0.69)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
