AITopics

This paper presents ThinkTank, a comprehensive and scalable framework designed to transform specialized AI agent systems into versatile collaborative intelligence platforms capable of supporting complex problem-solving across diverse domains. ThinkTank systematically generalizes agent roles, meeting structures, and knowledge integration mechanisms by adapting proven scientific collaboration methodologies. Through role abstraction, generalization of meeting types for iterative collaboration, and the integration of Retrieval-Augmented Generation with advanced knowledge storage, the framework facilitates expertise creation and robust knowledge sharing. ThinkTank enables organizations to leverage collaborative AI for knowledge-intensive tasks while ensuring data privacy and security through local deployment, utilizing frameworks like Ollama with models such as Llama3.1. The ThinkTank framework is designed to deliver significant advantages in cost-effectiveness, data security, scalability, and competitive positioning compared to cloud-based alternatives, establishing it as a universal platform for AI-driven collaborative problem-solving. The ThinkTank code is available at https://github.com/taugroup/ThinkTank

artificial intelligence, machine learning, natural language, (18 more...)

2506.02931

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Designing Algorithmic Delegates: The Role of Indistinguishability in Human-AI Handoff

Greenwood, Sophie, Levy, Karen, Barocas, Solon, Heidari, Hoda, Kleinberg, Jon

As AI technologies improve, people are increasingly willing to delegate tasks to AI agents. In many cases, the human decision-maker chooses whether to delegate to an AI agent based on properties of the specific instance of the decision-making problem they are facing. Since humans typically lack full awareness of all the factors relevant to this choice for a given decision-making instance, they perform a kind of categorization by treating indistinguishable instances -- those that have the same observable features -- as the same. In this paper, we define the problem of designing the optimal algorithmic delegate in the presence of categories. This is an important dimension in the design of algorithms to work with humans, since we show that the optimal delegate can be an arbitrarily better teammate than the optimal standalone algorithmic agent. The solution to this optimal delegation problem is not obvious: we discover that this problem is fundamentally combinatorial, and illustrate the complex relationship between the optimal design and the properties of the decision-making task even in simple settings. Indeed, we show that finding the optimal delegate is computationally hard in general. However, we are able to find efficient algorithms for producing the optimal delegate in several broad cases of the problem, including when the optimal action may be decomposed into functions of features observed by the human and the algorithm. Finally, we run computational experiments to simulate a designer updating an algorithmic delegate over time to be optimized for when it is actually adopted by users, and show that while this process does not recover the optimal delegate in general, the resulting delegate often performs quite well.

artificial intelligence, delegate, machine learning, (16 more...)

2506.03102

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Santa Clara County > Stanford (0.06)
North America > United States > New York > Tompkins County > Ithaca (0.04)
(7 more...)

Genre:

Research Report (0.64)
Workflow (0.45)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Rohr, Edward H., Nardini, John T.

A novel sensitivity analysis method for agent-based models stratifies in-silico tumor spheroid simulations

arXiv.org Machine LearningJun-4-2025

Agent-based models (ABMs) are widely used in biology to understand how individual actions scale into emergent population behavior. Modelers employ sensitivity analysis (SA) algorithms to quantify input parameters' impact on model outputs, however, it is hard to perform SA for ABMs due to their computational and complex nature. In this work, we develop the Simulate, Summarize, Reduce, Cluster, and Analyze (SSRCA) methodology, a machine-learning based pipeline designed to facilitate SA for ABMs. In particular, SSRCA can achieve the following tasks for ABMS: 1) identify sensitive model parameters, 2) reveal common output model patterns, and 3) determine which input parameter values generate these patterns. We use an example ABM of tumor spheroid growth to showcase how SSRCA provides similar SA results to the popular Sobol' Method while also identifying four common patterns from the ABM and the parameter regions that generate these outputs. This analysis could streamline data-driven tasks, such as parameter estimation, for ABMs by reducing parameter space. While we highlight these results with an ABM on tumor spheroid formation, the SSRCA methodology is broadly applicable to biological ABMs.

artificial intelligence, machine learning, simulation, (17 more...)

arXiv.org Machine Learning

2506.00168

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > New Jersey > Mercer County > Ewing (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Machine LearningJun-4-2025

Multi-Metric Adaptive Experimental Design under Fixed Budget with Validation

Zhang, Qining, Fiez, Tanner, Liu, Yi, Liu, Wenyang

Standard A/B tests in online experiments face statistical power challenges when testing multiple candidates simultaneously, while adaptive experimental designs (AED) alone fall short in inferring experiment statistics such as the average treatment effect, especially with many metrics (e.g., revenue, safety) and heterogeneous variances. This paper proposes a fixed-budget multi-metric AED framework with a two-phase structure: an adaptive exploration phase to identify the best treatment, and a validation phase with an A/B test to verify the treatment's quality and infer statistics. We propose SHRVar, which generalizes sequential halving (SH) (Karnin et al., 2013) with a novel relative-variance-based sampling and an elimination strategy built on reward z-values. It achieves a provable error probability that decreases exponentially, where the exponent generalizes the complexity measure for SH (Karnin et al., 2013) and SHVar (Lalitha et al., 2023) with homogeneous and heterogeneous variances, respectively. Numerical experiments verify our analysis and demonstrate the superior performance of this new framework.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2506.03062

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.47)
(2 more...)

Dynamic real-time multi-UAV cooperative mission planning method under multiple constraints

Liu, Chenglou, Lu, Yufeng, Xie, Fangfang, Ji, Tingwei, Zheng, Yao

As UAV popularity soars, so does the mission planning associated with it. The classical approaches suffer from the triple problems of decoupled of task assignment and path planning, poor real-time performance and limited adaptability. Aiming at these challenges, this paper proposes a dynamic real-time multi-UAV collaborative mission planning algorithm based on Dubins paths under a distributed formation structure. Dubins path with multiple advantages bridges the gap between task assignment and path planning, leading to a coupled solution for mission planning. Then, a series of acceleration techniques, task clustering preprocessing, highly efficient distance cost functions, low-complexity and less iterative task allocation strategies, are employed to guarantee the real-time performance of the algorithms. To cope with different emergencies and their simultaneous extremes, real-time planning of emerging tasks and mission replanning due to the reduction of available UAVs are appropriately handled. Finally, the developed algorithm is comprehensively exemplified and studied through simulations, highlighting that the proposed method only sacrifices 9.57% of the path length, while achieving a speed improvement of 4-5 orders of magnitude over the simulated annealing method, with a single mission planning of about 0.0003s.

evolutionary algorithm, machine learning, real time system, (20 more...)

2506.02365

Genre: Research Report (0.82)

Industry: Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

Siingh, Shikhhar, Rawat, Abhinav, Baral, Chitta, Gupta, Vivek

GETReason: Enhancing Image Context Extraction through Hierarchical Multi-Agent Reasoning

Publicly significant images from events hold valuable contextual information, crucial for journalism and education. However, existing methods often struggle to extract this relevance accurately. To address this, we introduce GETReason (Geospatial Event Temporal Reasoning), a framework that moves beyond surface-level image descriptions to infer deeper contextual meaning. We propose that extracting global event, temporal, and geospatial information enhances understanding of an image's significance. Additionally, we introduce GREAT (Geospatial Reasoning and Event Accuracy with Temporal Alignment), a new metric for evaluating reasoning-based image understanding. Our layered multi-agent approach, assessed using a reasoning-weighted metric, demonstrates that meaningful insights can be inferred, effectively linking images to their broader event context.

large language model, machine learning, natural language, (17 more...)

2505.21863

Country: Asia > Indonesia (0.69)

Genre: Research Report (0.64)

Industry:

Government (1.00)
Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

The Hitchhikers Guide to Production-ready Trustworthy Foundation Model powered Software (FMware)

Vasilevski, Kirill, Rombaut, Benjamin, Rajbahadur, Gopi Krishnan, Oliva, Gustavo A., Gallaba, Keheliya, Cogo, Filipe R., Lin, Jiahuei, Lin, Dayi, Zhang, Haoxiang, Chen, Bouyan, Thangarajah, Kishanthan, Hassan, Ahmed E., Jiang, Zhen Ming

Foundation Models (FMs) such as Large Language Models (LLMs) are reshaping the software industry by enabling FMware, systems that integrate these FMs as core components. In this KDD 2025 tutorial, we present a comprehensive exploration of FMware that combines a curated catalogue of challenges with real-world production concerns. We first discuss the state of research and practice in building FMware. We further examine the difficulties in selecting suitable models, aligning high-quality domain-specific data, engineering robust prompts, and orchestrating autonomous agents. We then address the complex journey from impressive demos to production-ready systems by outlining issues in system testing, optimization, deployment, and integration with legacy software. Drawing on our industrial experience and recent research in the area, we provide actionable insights and a technology roadmap for overcoming these challenges. Attendees will gain practical strategies to enable the creation of trustworthy FMware in the evolving technology landscape.

arxiv preprint arxiv, large language model, machine learning, (15 more...)

doi: 10.1145/3711896.3736572

2505.1064

Country: North America > Canada (0.16)

Genre: Overview (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning

Feng, Lang, Tan, Weihao, Lyu, Zhiyi, Zheng, Longtao, Xu, Haiyang, Yan, Ming, Huang, Fei, An, Bo

Online fine-tuning vision-language model (VLM) agents with reinforcement learning (RL) has shown promise for equipping agents with multi-step, goal-oriented capabilities in dynamic environments. However, their open-ended textual action space and non-end-to-end nature of action generation present significant challenges to effective online exploration in RL, e.g., explosion of the exploration space. We propose a novel online fine-tuning method, Counterfactual Soft Reinforcement Learning (CoSo), better suited to the textual output space of VLM agents. Compared to prior methods that assign uniform uncertainty to all tokens, CoSo leverages counterfactual reasoning to dynamically assess the causal influence of individual tokens on post-processed actions. By prioritizing the exploration of action-critical tokens while reducing the impact of semantically redundant or low-impact tokens, CoSo enables a more targeted and efficient online rollout process. We provide theoretical analysis proving CoSo's convergence and policy improvement guarantees, and extensive empirical evaluations supporting CoSo's effectiveness. Our results across a diverse set of agent tasks, including Android device control, card gaming, and embodied AI, highlight its remarkable ability to enhance exploration efficiency and deliver consistent performance gains. The code is available at https://github.com/langfengQ/CoSo.

large language model, machine learning, reinforcement learning, (15 more...)

2505.03792

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

DeCo: Defect-Aware Modeling with Contrasting Matching for Optimizing Task Assignment in Online IC Testing

Ting, Lo Pang-Yun, Chiang, Yu-Hao, Tsai, Yi-Tung, Lai, Hsu-Chao, Chuang, Kun-Ta

In the semiconductor industry, integrated circuit (IC) processes play a vital role, as the rising complexity and market expectations necessitate improvements in yield. Identifying IC defects and assigning IC testing tasks to the right engineers improves efficiency and reduces losses. While current studies emphasize fault localization or defect classification, they overlook the integration of defect characteristics, historical failures, and the insights from engineer expertise, which restrains their effectiveness in improving IC handling. To leverage AI for these challenges, we propose DeCo, an innovative approach for optimizing task assignment in IC testing. DeCo constructs a novel defect-aware graph from IC testing reports, capturing co-failure relationships to enhance defect differentiation, even with scarce defect data. Additionally, it formulates defect-aware representations for engineers and tasks, reinforced by local and global structure modeling on the defect-aware graph. Finally, a contrasting-based assignment mechanism pairs testing tasks with QA engineers by considering their skill level and current workload, thus promoting an equitable and efficient job dispatch. Experiments on a real-world dataset demonstrate that DeCo achieves the highest task-handling success rates in different scenarios, exceeding 80\%, while also maintaining balanced workloads on both scarce or expanded defect data. Moreover, case studies reveal that DeCo can assign tasks to potentially capable engineers, even for their unfamiliar defects, highlighting its potential as an AI-driven solution for the real-world IC failure analysis and task handling.

data mining, machine learning, natural language, (16 more...)

2505.00278

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Hardware (0.49)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

TestAgent: An Adaptive and Intelligent Expert for Human Assessment

Yu, Junhao, Zhuang, Yan, Sun, YuXuan, Gao, Weibo, Liu, Qi, Cheng, Mingyue, Huang, Zhenya, Chen, Enhong

Accurately assessing internal human states is key to understanding preferences, offering personalized services, and identifying challenges in real-world applications. Originating from psychometrics, adaptive testing has become the mainstream method for human measurement and has now been widely applied in education, healthcare, sports, and sociology. It customizes assessments by selecting the fewest test questions . However, current adaptive testing methods face several challenges. The mechanized nature of most algorithms leads to guessing behavior and difficulties with open-ended questions. Additionally, subjective assessments suffer from noisy response data and coarse-grained test outputs, further limiting their effectiveness. To move closer to an ideal adaptive testing process, we propose TestAgent, a large language model (LLM)-powered agent designed to enhance adaptive testing through interactive engagement. This is the first application of LLMs in adaptive testing. TestAgent supports personalized question selection, captures test-takers' responses and anomalies, and provides precise outcomes through dynamic, conversational interactions. Experiments on psychological, educational, and lifestyle assessments show our approach achieves more accurate results with 20% fewer questions than state-of-the-art baselines, and testers preferred it in speed, smoothness, and other dimensions.

adaptive testing, large language model, natural language, (17 more...)

2506.03032

Country: Europe > Austria (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)