Agents
Incremental Summarization for Customer Support via Progressive Note-Taking and Agent Feedback
Wu, Yisha, Zhao, Cen Mia, Cao, Yuanpei, Su, Xiaoqing, Mehdad, Yashar, Ji, Mindy, Cheng, Claire Na
We introduce an incremental summarization system for customer support agents that intelligently determines when to generate concise bullet notes during conversations, reducing agents' context-switching effort and redundant review. Our approach combines a fine-tuned Mixtral-8x7B model for continuous note generation with a DeBERTa-based classifier to filter trivial content. Agent edits refine the online notes generation and regularly inform offline model retraining, closing the agent edits feedback loop. Deployed in production, our system achieved a 3% reduction in case handling time compared to bulk summarization (with reductions of up to 9% in highly complex cases), alongside high agent satisfaction ratings from surveys. These results demonstrate that incremental summarization with continuous feedback effectively enhances summary quality and agent productivity at scale.
Paper2Video: Automatic Video Generation from Scientific Papers
Zhu, Zeyu, Lin, Kevin Qinghong, Shou, Mike Zheng
Academic presentation videos have become an essential medium for research communication, yet producing them remains highly labor-intensive, often requiring hours of slide design, recording, and editing for a short 2- to 10-minute video. Unlike natural video, presentation video generation involves distinctive challenges: inputs from research papers, dense multi-modal information (text, figures, tables), and the need to coordinate multiple aligned channels such as slides, subtitles, speech, and a human talker. To address these challenges, we introduce Paper2Video, the first benchmark of 101 research papers paired with author-created presentation videos, slides, and speaker metadata. We further design four tailored evaluation metrics--Meta Similarity, PresentArena, PresentQuiz, and IP Memory--to measure how videos convey the paper's information to the audience. Building on this foundation, we propose PaperTalker, the first multi-agent framework for academic presentation video generation. It integrates slide generation, with layout refinement by a novel tree search visual choice, together with cursor grounding, subtitling, speech synthesis, and talking-head rendering, while parallelizing slide-wise generation for efficiency. Experiments on Paper2Video demonstrate that the presentation videos produced by our approach are more faithful and informative than existing baselines, establishing a practical step toward automated and ready-to-use academic video generation. Our dataset, agent, and code are available at https://github.com/showlab/Paper2Video.
Machine-Learning Driven Load Shedding to Mitigate Instability Attacks in Power Grids
Tackett, Justin, Francis, Benjamin, Garcia, Luis, Grimsman, David, Warnick, Sean
Critical infrastructures are becoming increasingly complex as our society becomes increasingly dependent on them. This complexity opens the door to new possibilities for attacks and a need for new defense strategies. Our work focuses on instability attacks on the power grid, wherein an attacker causes cascading outages by introducing unstable dynamics into the system. When stress is placed on the power grid, a standard mitigation approach is load-shedding: the system operator chooses a set of loads to shut off until the situation is resolved. While this technique is standard, there is no systematic approach to choosing which loads will stop an instability attack. We show a proof of concept on the IEEE 14 Bus System using the Achilles Heel Technologies Power Grid Analyzer, and show through an implementation of modified Prony analysis (MPA) that MPA is a viable method for detecting instability attacks and triggering defense mechanisms. Throughout the past two hundred years, the power grid has become a core part of the infrastructure of the world. Every modern facility relies on electricity to sustain the way of life that has become prevalent in first world countries, powering everything from life sustaining equipment to financial transaction infrastructure.
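The detection step mentioned in this abstract builds on Prony-style modal analysis. As a rough illustration only, the sketch below implements the classic (unmodified) Prony method, not the paper's MPA: it fits a linear-prediction model to a sampled signal, takes the roots of the resulting polynomial as discrete-time modes, and flags the signal as unstable when any mode lies outside the unit circle. The function names `prony_modes` and `is_unstable`, the model order, and the tolerance are all assumptions made for this example.

```python
import numpy as np

def prony_modes(y, order):
    """Estimate discrete-time modes z_k of y[n] ~ sum_k A_k * z_k**n (classic Prony)."""
    y = np.asarray(y, dtype=float)
    n_samples = len(y)
    # Linear prediction: y[n] = c_1*y[n-1] + ... + c_p*y[n-p] for n = p .. N-1.
    # Column k holds the lagged samples y[n-k-1].
    lagged = np.column_stack(
        [y[order - k - 1 : n_samples - k - 1] for k in range(order)]
    )
    target = y[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    # The modes are the roots of z^p - c_1*z^(p-1) - ... - c_p = 0.
    return np.roots(np.concatenate(([1.0], -coeffs)))

def is_unstable(y, order, tol=1e-9):
    """Hypothetical trigger: True if any estimated mode has magnitude above 1."""
    return bool(np.max(np.abs(prony_modes(y, order))) > 1.0 + tol)

# Synthetic, noise-free oscillations (illustrative data, not from the paper).
n = np.arange(50)
growing = 1.02 ** n * np.cos(0.3 * n)   # amplitude grows over time
decaying = 0.95 ** n * np.cos(0.3 * n)  # amplitude decays over time
print(is_unstable(growing, order=2))    # True
print(is_unstable(decaying, order=2))   # False
```

On real measurements the signal is noisy, so the model order and the threshold would need tuning; a flag like this could then serve as the trigger for a defense such as load shedding.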
Using utility graphs to search for Pareto-optimal outcomes in complex, interdependent issue negotiations
Negotiation is a powerful tool for modelling complex interactions between self-interested agents, which can be people, companies or, increasingly, AI-enabled autonomous agents that aim to reach the best agreement for their human owners. While negotiation is often thought of as a competitive process, in which one party wins and the other loses, in practice most real negotiations involve more complex, win-win scenarios (Raiffa [20]), in which agreements can be found that maximize the utilities of both agents. Such outcomes (agreements) are called Pareto-efficient, i.e. it is not possible to find another outcome that would increase one agent's utility without making another agent worse off. Yet finding agreements that are Pareto-efficient is a challenging computational problem, especially in complex negotiation domains where the issues negotiated upon are interdependent (i.e. the utility of the value chosen for one negotiation issue depends strongly on the choices made for the others). Consider, for example, the negotiations between parties in a logistic supply chain: producers want certain combinations of resources and quantities delivered at certain times to be able to produce their goods, while suppliers may face similar constraints in their cost function for supplying different combinations of items. Or consider the peer-to-peer negotiations between prosumers in a decentralised power grid, who require certain amounts of energy at different times and locations, which involve non-linear constraints, especially if the capacity of the distribution network is limited.
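The definition of Pareto efficiency given above can be made concrete with a small brute-force filter. This is a minimal sketch that assumes each outcome is represented as a tuple of utilities, one per agent; the function name `pareto_front` and the sample utilities are hypothetical, and this is not the utility-graph search the paper proposes.

```python
def dominates(a, b):
    """True if outcome a is at least as good as b for every agent and strictly better for one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(outcomes):
    """Return the outcomes that no other outcome dominates (brute force, O(n^2))."""
    return [o for o in outcomes if not any(dominates(other, o) for other in outcomes)]

# Illustrative (utility of agent A, utility of agent B) pairs, invented for this example.
outcomes = [(0.9, 0.1), (0.6, 0.6), (0.5, 0.5), (0.1, 0.9), (0.6, 0.4)]
print(pareto_front(outcomes))  # [(0.9, 0.1), (0.6, 0.6), (0.1, 0.9)]
```

Enumerating every outcome like this is only feasible for tiny domains. With many interdependent issues the outcome space grows exponentially, which is why finding Pareto-efficient agreements is described as computationally challenging.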
CoCoA: Collaborative Chain-of-Agents for Parametric-Retrieved Knowledge Synergy
Jiang, Yi, Zhao, Sendong, Li, Jianbo, Wang, Haochun, Zhang, Lizhe, Liu, Yan, Qin, Bing
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs), especially for knowledge-intensive tasks. Despite its advantages, current RAG methods often struggle to fully exploit knowledge during generation. In particular, the synergy between the model's internal parametric knowledge and external retrieved knowledge remains limited. Retrieved contents may sometimes mislead generation, while certain generated content can guide the model toward more accurate outputs. In this work, we propose Collaborative Chain-of-Agents, a framework designed to explicitly enhance the synergy between parametric and retrieved knowledge. Specifically, we first introduce CoCoA-zero, a multi-agent RAG framework that first performs conditional knowledge induction and then reasons answers. Building on this, we develop CoCoA, a long-chain training strategy that synthesizes extended multi-agent reasoning trajectories from CoCoA-zero to fine-tune the LLM. This strategy enhances the model's capability to explicitly integrate and jointly leverage parametric and retrieved knowledge. Experimental results demonstrate the superiority of CoCoA in open-domain QA and multi-hop QA.
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Sun, Yu, Qian, Xingyu, Xu, Weiwen, Zhang, Hao, Xiao, Chenghao, Li, Long, Zhao, Deli, Huang, Wenbing, Xu, Tingyang, Bai, Qifeng, Rong, Yu
Reasoning-based large language models have excelled in mathematics and programming, yet their potential in knowledge-intensive medical question answering remains underexplored and insufficiently validated in clinical contexts. To bridge this gap, we introduce ReasonMed, the largest medical reasoning dataset to date, comprising 370k high-quality examples distilled from 1.75 million initial reasoning paths generated by complementary LLMs and curated through a cost-efficient easy-medium-difficult (EMD) pipeline. ReasonMed is built through a multi-agent generation, verification, and refinement process, in which an Error Refiner improves reasoning paths by correcting error-prone steps identified by a verifier. Using ReasonMed, we investigate effective strategies for training medical reasoning models and find that integrating detailed CoT reasoning with concise answer summaries yields the most robust fine-tuning results. Models trained on ReasonMed set a new benchmark: ReasonMed-7B surpasses the prior best sub-10B models by 4.17% and even exceeds LLaMA3.1-70B on PubMedQA by 4.60%. When scaled to ReasonMed-14B, it remains highly competitive, underscoring consistent scaling potential. The code and datasets are available at https://github.com/YuSun-Work/ReasonMed.