AITopics

Optimization Modeling (OM) is essential for solving complex decision-making problems. However, the process remains time-consuming and error-prone, heavily relying on domain experts. While Large Language Models (LLMs) show promise in addressing these challenges through their natural language understanding and reasoning capabilities, current approaches face three critical limitations: high benchmark labeling error rates reaching up to 42%, narrow evaluation scope that only considers optimal values, and computational inefficiency due to heavy reliance on multi-agent systems or model fine-tuning. In this work, we first enhance existing datasets through systematic error correction and more comprehensive annotation. Additionally, we introduce LogiOR, a new optimization modeling benchmark from the logistics domain, containing more complex problems with standardized annotations. Furthermore, we present ORThought, a novel framework that leverages expert-level optimization modeling principles through chain-of-thought reasoning to automate the OM process. Through extensive empirical evaluation, we demonstrate that ORThought outperforms existing approaches, including multi-agent frameworks, with particularly significant advantages on complex optimization problems. Finally, we provide a systematic analysis of our method, identifying critical success factors and failure modes, providing valuable insights for future research on LLM-based optimization modeling.

large language model, machine learning, natural language, (19 more...)

2508.1441

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Freight & Logistics Services (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Gonnermann-Müller, Jana, Haase, Jennifer, Fackeldey, Konstantin, Pokutta, Sebastian

FACET: Teacher-Centred LLM-Based Multi-Agent Systems-Towards Personalized Educational Worksheets

The increasing heterogeneity of student populations poses significant challenges for teachers, particularly in mathematics education, where cognitive, motivational, and emotional differences strongly influence learning outcomes. While AI-driven personalization tools have emerged, most remain performance-focused, offering limited support for teachers and neglecting broader pedagogical needs. This paper presents the FACET framework, a teacher-facing, large language model (LLM)-based multi-agent system designed to generate individualized classroom materials that integrate both cognitive and motivational dimensions of learner profiles. The framework comprises three specialized agents: (1) learner agents that simulate diverse profiles incorporating topic proficiency and intrinsic motivation, (2) a teacher agent that adapts instructional content according to didactical principles, and (3) an evaluator agent that provides automated quality assurance. We tested the system using authentic grade 8 mathematics curriculum content and evaluated its feasibility through a) automated agent-based assessment of output quality and b) exploratory feedback from K-12 in-service teachers. Results from ten internal evaluations highlighted high stability and alignment between generated materials and learner profiles, and teacher feedback particularly highlighted structure and suitability of tasks. The findings demonstrate the potential of multi-agent LLM architectures to provide scalable, context-aware personalization in heterogeneous classroom settings, and outline directions for extending the framework to richer learner profiles and real-world classroom trials.

artificial intelligence, learner, worksheet, (17 more...)

2508.11401

Country:

Asia (0.93)
Europe > Germany (0.69)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.94)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

A Compositional Framework for On-the-Fly LTLf Synthesis

Li, Yongkang, Xiao, Shengping, Zhu, Shufang, Li, Jianwen, Pu, Geguang

Reactive synthesis from Linear Temporal Logic over finite traces (LTLf) can be reduced to a two-player game over a Deterministic Finite Automaton (DFA) of the LTLf specification. The primary challenge here is DFA construction, which is 2EXPTIME-complete in the worst case. Existing techniques either construct the DFA compositionally before solving the game, leveraging automata minimization to mitigate state-space explosion, or build the DFA incrementally during game solving to avoid full DFA construction. However, neither is dominant. In this paper, we introduce a compositional on-the-fly synthesis framework that integrates the strengths of both approaches, focusing on large conjunctions of smaller LTLf formulas common in practice. This framework applies composition during game solving instead of automata (game arena) construction. While composing all intermediate results may be necessary in the worst case, pruning these results simplifies subsequent compositions and enables early detection of unrealizability. Specifically, the framework allows two composition variants: pruning before composition to take full advantage of minimization or pruning during composition to guide on-the-fly synthesis. Compared to state-of-the-art synthesis solvers, our framework is able to solve a notable number of instances that other solvers cannot handle. A detailed analysis shows that both composition variants have unique merits.

artificial intelligence, logic & formal reasoning, specification, (17 more...)

2508.04116

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Winiarski, Tomasz, Kaniuka, Jan, Giełdowski, Daniel, Ostrysz, Jakub, Radlak, Krystian, Kushnir, Dmytro

ROS-related Robotic Systems Development with V-model-based Application of MeROS Metamodel

Systems built on the Robot Operating System (ROS) are increasingly easy to assemble, yet hard to govern and reliably coordinate. Beyond the sheer number of subsystems involved, the difficulty stems from their diversity and interaction depth. In this paper, we use a compact heterogeneous robotic system (HeROS), combining mobile and manipulation capabilities, as a demonstration vehicle under dynamically changing tasks. Notably, all its subsystems are powered by ROS. The use of compatible interfaces and other ROS integration capabilities simplifies the construction of such systems. However, this only addresses part of the complexity: the semantic coherence and structural traceability are even more important for precise coordination and call for deliberate engineering methods. The Model-Based Systems Engineering (MBSE) discipline, which emerged from the experience of complexity management in large-scale engineering domains, offers the methodological foundations needed. Despite their strengths in complementary aspects of robotics systems engineering, the lack of a unified approach to integrate ROS and MBSE hinders the full potential of these tools. Motivated by the anticipated impact of such a synergy in robotics practice, we propose a structured methodology based on MeROS - a SysML metamodel created specifically to put the ROS-based systems into the focus of the MBSE workflow. As its methodological backbone, we adapt the well-known V-model to this context, illustrating how complex robotic systems can be designed with traceability and validation capabilities embedded into their lifecycle using practices familiar to engineering teams.

artificial intelligence, diagram, mobile robot, (16 more...)

2506.08706

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)
North America > Canada (0.67)

Genre:

Research Report (0.64)
Workflow (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Empirical Validation of the Independent Chip Model

Kim, Juho

The independent chip model (ICM) forms a cornerstone of all modern poker tournament strategy. However, despite its prominence, the ICM's performance in the real world has not been sufficiently scrutinized, especially at a large scale. In this paper, we introduce our new dataset of poker tournaments, consisting of results of over ten thousand events. Then, using this dataset, we perform two experiments as part of a large-scale empirical validation of the ICM. First, we verify that the ICM performs more accurately than a baseline we propose. Second, we obtain empirical evidence of the ICM underestimating the performances of players with larger stacks while overestimating those who are short-stacked. Our contributions may be useful to future researchers developing new algorithms for estimating a player's value in poker tournaments.

artificial intelligence, icm, tournament, (15 more...)

doi: 10.1109/CoG64752.2025.11114139

2506.0018

Country: North America > United States (0.69)

Genre: Research Report > Experimental Study (0.70)

Industry: Leisure & Entertainment > Games > Poker (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

MIT Technology ReviewAug-22-2025, 11:00:00 GMT

Meet the researcher hosting a scientific conference by and for AI

That idea is not without its detractors. Among other issues, many feel AI is not capable of the creative thought needed in research, makes too many mistakes and hallucinations, and may limit opportunities for young researchers. Nevertheless, a number of scientists and policymakers are very keen on the promise of AI scientists. The US government's AI Action Plan describes the need to "invest in automated cloud-enabled labs for a range of scientific fields." Some researchers think AI scientists could unlock scientific discoveries that humans could never find alone.

artificial intelligence, scientific conference, scientist, (5 more...)

MIT Technology Review

Industry: Government (0.59)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.55)

Neural Information Processing SystemsAug-22-2025, 02:57:46 GMT

b6676756f8a935e208f394a1ba47f0bc-Paper-Datasets_and_Benchmarks_Track.pdf

cloud computing, machine learning, reinforcement learning, (18 more...)

Country: North America > United States > New York (0.15)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology > Services (1.00)
Energy > Energy Storage (1.00)
Energy > Oil & Gas > Upstream (0.45)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Alexander Peysakhovich, Christian Kroer, Adam Lerer

Robust Multi-agent Counterfactual Prediction

Neural Information Processing SystemsAug-22-2025, 02:27:28 GMT

They make counterfactual predictions by using observed actions to learn the underlying utility function (a.k.a.

artificial intelligence, machine learning, revelation game, (19 more...)

Country: North America > Canada (0.04)

Industry:

Leisure & Entertainment > Games (0.68)
Education (0.47)
Marketing (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsAug-22-2025, 01:32:56 GMT

c8f9db5b83fac60ca3c6d6d06a9adcd6-Paper-Conference.pdf

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsAug-22-2025, 01:31:57 GMT

A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization Songtao Lu

Classical bilevel optimization is referred to as the case where there is no consensus constraint but with only two levels of the minimization subproblems, i.e.,

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Ohio > Franklin County > Columbus (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)