AITopics

People without a database background usually rely on file systems or tools such as Excel for data management, which often lead to redundancy and data inconsistency. Relational databases possess strong data management capabilities, but require a high level of professional expertise from users. Although there are already many works on Text2SQL to automate the translation of natural language into SQL queries for data manipulation, all of them presuppose that the database schema is pre-designed. In practice, schema design itself demands domain expertise, and research on directly generating schemas from textual requirements remains unexplored. In this paper, we systematically define a new problem, called Text2Schema, to convert a natural language text requirement into a relational database schema. With an effective Text2Schema technique, users can effortlessly create database table structures using natural language, and subsequently leverage existing Text2SQL techniques to perform data manipulations, which significantly narrows the gap between non-technical personnel and highly efficient, versatile relational database systems. We propose SchemaAgent, an LLM-based multi-agent framework for Text2Schema. We emulate the workflow of manual schema design by assigning specialized roles to agents and enabling effective collaboration to refine their respective subtasks. We also incorporate dedicated roles for reflection and inspection, along with an innovative error detection and correction mechanism to identify and rectify issues across various phases. Moreover, we build and open source a benchmark containing 381 pairs of requirement description and schema. Experimental results demonstrate the superiority of our approach over comparative work.

large language model, machine learning, natural language, (18 more...)

2503.23886

Country:

Europe (1.00)
Asia > China (0.29)

Genre:

Overview (1.00)
Research Report > New Finding (0.66)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

PolySkill: Learning Generalizable Skills Through Polymorphic Abstraction

Yu, Simon, Li, Gang, Shi, Weiyan, Qi, Peng

Large language models (LLMs) are moving beyond static uses and are now powering agents that learn continually during their interaction with external environments. For example, agents can learn reusable skills while navigating web pages or toggling new tools. However, existing methods for skill learning often create skills that are over-specialized to a single website and fail to generalize. We introduce PolySkill, a new framework that enables agents to learn generalizable and compositional skills. The core idea, inspired by polymorphism in software engineering, is to decouple a skill's abstract goal (what it accomplishes) and its concrete implementation (how it is executed). Experiments show that our method (1) improves skill reuse by 1.7x on seen websites and (2) boosts success rates by up to 9.4% on Mind2Web and 13.9% on unseen websites, while reducing steps by over 20%. (3) In self-exploration settings without specified tasks, our framework improves the quality of proposed tasks and enables agents to learn generalizable skills that work across different sites. By enabling the agent to identify and refine its own goals, the PolySkill enhances the agent's ability to learn a better curriculum, leading to the acquisition of more generalizable skills compared to baseline methods. This work provides a practical path toward building agents capable of continual learning in adaptive environments. Our findings show that separating a skill's goal from its execution is a crucial step toward developing autonomous agents that can learn and generalize across the open web continuously.

large language model, machine learning, natural language, (21 more...)

2510.15863

Genre: Research Report > New Finding (1.00)

Industry:

Retail (1.00)
Education (1.00)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Schön, Oliver, Haesaert, Sofie, Soudjani, Sadegh

Formal Control for Uncertain Systems via Contract-Based Probabilistic Surrogates (Extended Version)

The requirement for identifying accurate system representations has not only been a challenge to fulfill, but it has compromised the scalability of formal methods, as the resulting models are often too complex for effective decision making with formal correctness and performance guarantees. Focusing on probabilistic simulation relations and surrogate models of stochastic systems, we propose an approach that significantly enhances the scalability and practical applicability of such simulation relations by eliminating the need to compute error bounds directly. As a result, we provide an abstraction-based technique that scales effectively to higher dimensions while addressing complex nonlinear agent-environment interactions with infinite-horizon temporal logic guarantees amidst uncertainty. Our approach trades scalability for conservatism favorably, as demonstrated on a complex high-dimensional vehicle intersection case study.

artificial intelligence, formal contract-based control, relation, (16 more...)

2506.16971

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Self-evolving expertise in complex non-verifiable subject domains: dialogue as implicit meta-RL

Bailey, Richard M.

So-called `wicked problems', those involving complex multi-dimensional settings, non-verifiable outcomes, heterogeneous impacts and a lack of single objectively correct answers, have plagued humans throughout history. Modern examples include decisions over justice frameworks, solving environmental pollution, planning for pandemic resilience and food security. The use of state-of-the-art artificial intelligence systems (notably Large Language Model-based agents) collaborating with humans on solving such problems is being actively explored. While the abilities of LLMs can be improved by, for example, fine-tuning, hand-crafted system prompts and scaffolding with external tools, LLMs lack endogenous mechanisms to develop expertise through experience in such settings. This work address this gap with Dialectica, a framework where agents engage in structured dialogue on defined topics, augmented by memory, self-reflection, and policy-constrained context editing. Formally, discussion is viewed as an implicit meta-reinforcement learning process. The `dialogue-trained' agents are evaluated post-hoc using judged pairwise comparisons of elicited responses. Across two model architectures (locally run Qwen3:30b and OpenAI's o4-mini) results show that enabling reflection-based context editing during discussion produces agents which dominate their baseline counterparts on Elo scores, normalized Bradley-Terry-Davidson ability, and AlphaRank mass. The predicted signatures of learning are observed qualitatively in statement and reflection logs, where reflections identify weaknesses and reliably shape subsequent statements. Agreement between quantitative and qualitative evidence supports dialogue-driven context evolution as a practical path to targeted expertise amplification in open non-verifiable domains.

large language model, machine learning, natural language, (21 more...)

2510.15772

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.34)

Industry:

Law > Environmental Law (1.00)
Government (1.00)
Energy (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Grassroots Logic Programs: A Secure, Multiagent, Concurrent, Logic Programming Language

Shapiro, Ehud

Grassroots platforms are distributed applications run by\linebreak cryptographically-identified people on their networked personal devices, where multiple disjoint platform instances emerge independently and coalesce when they interoperate. Their foundation is the grassroots social graph, upon which grassroots social networks, grassroots cryptocurrencies, and grassroots democratic federations can be built. Grassroots platforms have yet to be implemented, the key challenge being faulty and malicious participants: without secure programming support, correct participants cannot reliably identify each other, establish secure communication, or verify each other's code integrity. We present Grassroots Logic Programs (GLP), a secure, multiagent, concurrent, logic programming language for implementing grassroots platforms. GLP extends logic programs with paired single-reader/single-writer (SRSW) logic variables, providing secure communication channels among cryptographically-identified people through encrypted, signed and attested messages, which enable identity and code integrity verification. We present GLP progressively: logic programs, concurrent GLP, multiagent GLP, augmenting it with cryptographic security, and providing smartphone implementation-ready specifications. We prove safety properties including that GLP computations are deductions, SRSW preservation, acyclicity, and monotonicity. We prove multiagent GLP is grassroots and that GLP streams achieve blockchain security properties. We present a grassroots social graph protocol establishing authenticated peer-to-peer connections and demonstrate secure grassroots social networking applications.

artificial intelligence, logic & formal reasoning, logic programming, (17 more...)

2510.15747

Country: Europe (0.27)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Chiris, Lorenzo Satta, Mishra, Ayush

AURA: An Agent Autonomy Risk Assessment Framework

As autonomous agentic AI systems see increasing adoption across organisations, persistent challenges in alignment, governance, and risk management threaten to impede deployment at scale. We present AURA (Agent aUtonomy Risk Assessment), a unified framework designed to detect, quantify, and mitigate risks arising from agentic AI. Building on recent research and practical deployments, AURA introduces a gamma-based risk scoring methodology that balances risk assessment accuracy with computational efficiency and practical considerations. AURA provides an interactive process to score, evaluate and mitigate the risks of running one or multiple AI Agents, synchronously or asynchronously (autonomously). The framework is engineered for Human-in-the-Loop (HITL) oversight and presents Agent-to-Human (A2H) communication mechanisms, allowing for seamless integration with agentic systems for autonomous self-assessment, rendering it interoperable with established protocols (MCP and A2A) and tools. AURA supports a responsible and transparent adoption of agentic AI and provides robust risk detection and mitigation while balancing computational resources, positioning it as a critical enabler for large-scale, governable agentic AI in enterprise environments.

machine learning, mitigation, natural language, (18 more...)

2510.15739

Country:

Europe > United Kingdom > England (0.28)
North America > United States (0.28)

Genre: Research Report (0.65)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Besrour, Ines, He, Jingbo, Schreieder, Tobias, Färber, Michael

SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation

We present SQuAI (https://squai.scads.ai/), a scalable and trustworthy multi-agent retrieval-augmented generation (RAG) framework for scientific question answering (QA) with large language models (LLMs). SQuAI addresses key limitations of existing RAG systems in the scholarly domain, where complex, open-domain questions demand accurate answers, explicit claims with citations, and retrieval across millions of scientific documents. Built on over 2.3 million full-text papers from arXiv.org, SQuAI employs four collaborative agents to decompose complex questions into sub-questions, retrieve targeted evidence via hybrid sparse-dense retrieval, and adaptively filter documents to improve contextual relevance. To ensure faithfulness and traceability, SQuAI integrates in-line citations for each generated claim and provides supporting sentences from the source documents. Our system improves faithfulness, answer relevance, and contextual relevance by up to +0.088 (12%) over a strong RAG baseline. We further release a benchmark of 1,000 scientific question-answer-evidence triplets to support reproducibility. With transparent reasoning, verifiable citations, and domain-wide scalability, SQuAI demonstrates how multi-agent RAG enables more trustworthy scientific QA with LLMs.

large language model, machine learning, natural language, (18 more...)

doi: 10.1145/3746252.3761471

2510.15682

Country:

Europe (1.00)
North America > United States (0.71)
Asia > Middle East > UAE (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Ortega, Tomas, Jafarkhani, Hamid

Decentralized Parameter-Free Online Learning

We propose the first parameter-free decentralized online learning algorithms with network regret guarantees, which achieve sublinear regret without requiring hyperparameter tuning. This family of algorithms connects multi-agent coin-betting and decentralized online learning via gossip steps. To enable our decentralized analysis, we introduce a novel "betting function" formulation for coin-betting that simplifies the multi-agent regret analysis. Our analysis shows sublinear network regret bounds and is validated through experiments on synthetic and real datasets. This family of algorithms is applicable to distributed sensing, decentralized optimization, and collaborative ML applications.

algorithm, artificial intelligence, machine learning, (16 more...)

2510.15644

Country: North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.91)

Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation

Li, Ed, Ren, Junyu, Pan, Xintian, Yan, Cat, Li, Chuanhao, Bergemann, Dirk, Yang, Zhuoran

The automation of scientific discovery represents a critical milestone in Artificial Intelligence (AI) research. However, existing agentic systems for science suffer from two fundamental limitations: rigid, pre-programmed workflows that cannot adapt to intermediate findings, and inadequate context management that hinders long-horizon research. We present \texttt{freephdlabor}, an open-source multiagent framework featuring \textit{fully dynamic workflows} determined by real-time agent reasoning and a \coloremph{\textit{modular architecture}} enabling seamless customization -- users can modify, add, or remove agents to address domain-specific requirements. The framework provides comprehensive infrastructure including \textit{automatic context compaction}, \textit{workspace-based communication} to prevent information degradation, \textit{memory persistence} across sessions, and \textit{non-blocking human intervention} mechanisms. These features collectively transform automated research from isolated, single-run attempts into \textit{continual research programs} that build systematically on prior explorations and incorporate human feedback. By providing both the architectural principles and practical implementation for building customizable co-scientist systems, this work aims to facilitate broader adoption of automated research across scientific domains, enabling practitioners to deploy interactive multiagent systems that autonomously conduct end-to-end research -- from ideation through experimentation to publication-ready manuscripts.

agent, artificial intelligence, workflow, (16 more...)

2510.15624

Genre:

Workflow (1.00)
Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Distelzweig, Aron, Janjoš, Faris, Scheel, Oliver, Varra, Sirish Reddy, Rajan, Raghu, Boedecker, Joschka

Perfect Prediction or Plenty of Proposals? What Matters Most in Planning for Autonomous Driving

Abstract-- Traditionally, prediction and planning in autonomous driving (AD) have been treated as separate, sequential modules. Recently, there has been a growing shift towards tighter integration of these components, known as Integrated Prediction and Planning (IPP), with the aim of enabling more informed and adaptive decision-making. However, it remains unclear to what extent this integration actually improves planning performance. In this work, we investigate the role of prediction in IPP approaches, drawing on the widely adopted V al14 benchmark, which encompasses more common driving scenarios with relatively low interaction complexity, and the interPlan benchmark, which includes highly interactive and out-of-distribution driving situations. Our analysis reveals that even access to perfect future predictions does not lead to better planning outcomes, indicating that current IPP methods often fail to fully exploit future behavior information. Instead, we focus on high-quality proposal generation, while using predictions primarily for collision checks. We find that many imitation learning-based planners struggle to generate realistic and plausible proposals, performing worse than PDM--a simple lane-following approach. Motivated by this observation, we build on PDM with an enhanced proposal generation method, shifting the emphasis towards producing diverse but realistic and high-quality proposals. This proposal-centric approach significantly outperforms existing methods, especially in out-of-distribution and highly interactive settings, where it sets new state-of-the-art results.

artificial intelligence, machine learning, prediction, (16 more...)

2510.15505

Country: Europe > Germany (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Information Technology > Robotics & Automation (0.87)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)