AITopics

Large Language Models (LLMs) can serve as world models to enhance agent decision-making in digital environments by simulating future states and predicting action outcomes, potentially eliminating costly trial-and-error exploration. However, this capability is fundamentally limited by LLM's tendency to hallucination and their reliance on static training knowledge, which could lead to compounding errors that inhibit long-horizon simulations. To systematically investigate whether LLMs are appropriate for world modeling, we probe two core capabilities of world models - future state prediction and reward estimation - through three tasks: next-state identification, full-procedure planning alignment, and milestone transition recognition. Our analysis shows that while LLMs effectively capture immediate next states and identify meaningful state transitions, their performance rapidly degrades in full-procedure planning. This highlights LLMs' limitations in reliably modeling environment dynamics over long horizons. To address these limitations, we propose the Retrieval-augmented World Model (R-WoM), which grounds LLM simulations by incorporating factual, up-to-date knowledge retrieved from external tutorials. Experiments show that R-WoM achieves substantial improvements of up to 25.3% (OSWorld) and 18.1% (WebArena) compared to baselines, with particular advantage in longer-horizon simulations. World models have evolved from early symbolic planning systems to sophisticated neural architectures that learn latent representations of environment dynamics. Model-based reinforcement learning (MBRL) approaches, such as Dreamer v1-3 (Hafner et al., 2019; 2020; 2023) and MuZero (Schrittwieser et al., 2020), learn latent world models to "imagine" trajectories before selecting actions. More recently, Large Language Model (LLM)-based world models (Hao et al., 2023; Wang et al., 2024; Zhang et al., 2024) have emerged as a new paradigm, leveraging large-scale pre-training to reason about action consequences in realistic digital environments.

artificial intelligence, large language model, natural language, (17 more...)

2510.11892

Genre:

Research Report (1.00)
Workflow (0.68)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Wang, Ruida, Yao, Jiarui, Pan, Rui, Diao, Shizhe, Zhang, Tong

Solving math problems through verifiable languages such as Lean has significantly impacted both the mathematics and computer science communities. Current state-of-the-art models are often trained with expensive online Reinforcement Learning (RL) or expert iteration. However, these approaches rely on fixed problem sets, which causes inefficient training and limits the model to tackle complex problems. To overcome these limitations, we propose GAR: Generative Adversarial Reinforcement learning, a comprehensive RL training framework that jointly trains the problem composer and solver in an adversarial loop. GAR introduces an implicit curriculum learning mechanism, which aligns task difficulty with the prover's evolving capability. It thereby improves the training efficiency and enables stronger performance of proving advanced theorems. Experiments show that with GAR training, Goedel-Prover-V2-8B and DeepSeek-Prover-V2-7B achieve an average relative improvement in pass@32 of 4.20% on MiniF2F-Test benchmark, while DeepSeek-Prover-V2's pass@32 on ProofNet-Test increases from 22.58% to 25.81%. Beyond formal proving, GAR establishes a general RL paradigm for co-evolution of problem generation and solving under verifiable environments.

large language model, machine learning, polynomial, (20 more...)

2510.11769

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
(3 more...)

Artificial Intelligence for Optimal Learning: A Comparative Approach towards AI-Enhanced Learning Environments

Hariharan, Ananth

In the rapidly evolving educational landscape, the integration of technology has shifted from an enhancement to a cornerstone of educational strategy worldwide. This transition is propelled by advancements in digital technology, especially the emergence of artificial intelligence as a crucial tool in learning environments. This research project critically evaluates the impact of three distinct educational settings: traditional educational methods without technological integration, those enhanced by non-AI technology, and those utilising AI-driven technologies. This comparison aims to assess how each environment influences educational outcomes, engagement, pedagogical methods, and equity in access to learning resources, and how each contributes uniquely to the learning experience. The ultimate goal of this research is to synthesise the strengths of each model to create a more holistic educational approach. By integrating the personal interaction and tested pedagogical techniques of traditional classrooms, the enhanced accessibility and collaborative tools offered by non-AI technology, and the personalised, adaptive learning strategies enabled by AI-driven technologies, education systems can develop richer, more effective learning environments. This hybrid approach aims to leverage the best elements of each setting, thereby enhancing educational outcomes, engagement, and inclusiveness, while also addressing the distinct challenges and limitations inherent in each model. The intention is to create an educational framework deeply attentive to the diverse needs of students, ensuring equitable access to high-quality education for all.

artificial intelligence, machine learning, natural language, (18 more...)

2510.11755

Genre:

Research Report (1.00)
Instructional Material (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.93)
Education > Educational Technology > Educational Software > Computer Based Training (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.94)

Li, Shangzhe, Zhou, Dongruo, Zhang, Weitong

Near-Optimal Second-Order Guarantees for Model-Based Adversarial Imitation Learning

We study online adversarial imitation learning (AIL), where an agent learns from offline expert demonstrations and interacts with the environment online without access to rewards. Despite strong empirical results, the benefits of online interaction and the impact of stochasticity remain poorly understood. We address these gaps by introducing a model-based AIL algorithm (MB-AIL) and establish its horizon-free, second-order sample-complexity guarantees under general function approximations for both expert data and reward-free interactions. These second-order bounds provide an instance-dependent result that can scale with the variance of returns under the relevant policies and therefore tighten as the system approaches determinism. Together with second-order, information-theoretic lower bounds on a newly constructed hard-instance family, we show that MB-AIL attains minimax-optimal sample complexity for online interaction (up to logarithmic factors) with limited expert demonstrations and matches the lower bound for expert demonstrations in terms of the dependence on horizon $H$, precision $ε$ and the policy variance $σ^2$. Experiments further validate our theoretical findings and demonstrate that a practical implementation of MB-AIL matches or surpasses the sample efficiency of existing methods.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2510.09487

Country: North America > United States (0.92)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Learning Diffusion Models with Flexible Representation Guidance

Wang, Chenyu, Zhou, Cai, Gupta, Sharut, Lin, Zongyu, Jegelka, Stefanie, Bates, Stephen, Jaakkola, Tommi

Diffusion models can be improved with additional guidance towards more effective representations of input. Indeed, prior empirical work has already shown that aligning internal representations of the diffusion model with those of pre-trained models improves generation quality. In this paper, we present a systematic framework for incorporating representation guidance into diffusion models. We provide alternative decompositions of denoising models along with their associated training criteria, where the decompositions determine when and how the auxiliary representations are incorporated. Guided by our theoretical insights, we introduce two new strategies for enhancing representation alignment in diffusion models. First, we pair examples with target representations either derived from themselves or arisen from different synthetic modalities, and subsequently learn a joint model over the multimodal pairs. Second, we design an optimal training curriculum that balances representation learning and data generation. Our experiments across image, protein sequence, and molecule generation tasks demonstrate superior performance as well as accelerated training. In particular, on the class-conditional ImageNet $256\times 256$ benchmark, our guidance results in $23.3$ times faster training than the original SiT-XL as well as four times speedup over the state-of-the-art method REPA. The code is available at https://github.com/ChenyuWang-Monica/REED.

large language model, machine learning, natural language, (17 more...)

2507.0898

Genre:

Instructional Material > Course Syllabus & Notes (0.34)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

A Community-driven vision for a new Knowledge Resource for AI

Chaudhri, Vinay K, Baru, Chaitan, Bennett, Brandon, Bhatt, Mehul, Cassel, Darion, Cohn, Anthony G, Dechter, Rina, Erdem, Esra, Ferrucci, Dave, Forbus, Ken, Gelfond, Gregory, Genesereth, Michael, Gordon, Andrew S., Grosof, Benjamin, Gupta, Gopal, Hendler, Jim, Israni, Sharat, Josephson, Tyler R., Kyllonen, Patrick, Lierler, Yuliya, Lifschitz, Vladimir, McFate, Clifton, McGinty, Hande K., Morgenstern, Leora, Oltramari, Alessandro, Paritosh, Praveen, Roth, Dan, Shepard, Blake, Shimzu, Cogan, Vrandečić, Denny, Whiting, Mark, Witbrock, Michael

The Cyc project, started in 1984, created the first large-scale database of commonsense knowledge. The initiative continues to this day with its aim to provide a comprehensive ontology and knowledge base of commonsense knowledge to enable human-like reasoning for AI systems. In the concluding paragraph of his Communications of the Association of Computing Machinery (CACM) 1995 article A Large-Scale Investment in Knowledge Infrastructure [52], Cyc's founder Douglas B. Lenat wrote: Is Cyc necessary? How far would a user get with something simpler than Cyc but that lacks everyday commonsense knowledge? Nobody knows; the question will be settled empirically. Our guess is most of these applications will eventually tap the synergy in a suite of sources (including neural nets and decision theory), one of which will be Cyc. Although 30 years have passed since the above article was written, AI research community has not conclusively settled [10] the question "How far would a user get with something simpler than Cyc but that lacks everyday commonsense knowledge?" However, it is clear that significant strides have been made in addressing many of the tasks that were original Cyc use cases, including information retrieval, semi-automatically linking multiple heterogeneous external information sources, spelling and grammar correction, machine translation, natural language understanding and speech understanding.

artificial intelligence, knowledge, machine learning, (18 more...)

2506.16596

Country:

North America > United States > California (0.28)
North America > United States > Texas (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (0.93)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Mendes, Paulo R. C., Vieira, Eduardo S., Guedes, Álan L. V., Busson, Antonio J. G., Colcher, Sérgio

A Clustering-Based Method for Automatic Educational Video Recommendation Using Deep Face-Features of Lecturers

Discovering and accessing specific content within educational video bases is a challenging task, mainly because of the abundance of video content and its diversity. Recommender systems are often used to enhance the ability to find and select content. But, recommendation mechanisms, especially those based on textual information, exhibit some limitations, such as being error-prone to manually created keywords or due to imprecise speech recognition. This paper presents a method for generating educational video recommendation using deep face-features of lecturers without identifying them. More precisely, we use an unsupervised face clustering mechanism to create relations among the videos based on the lecturer's presence. Then, for a selected educational video taken as a reference, we recommend the ones where the presence of the same lecturers is detected. Moreover, we rank these recommended videos based on the amount of time the referenced lecturers were present. For this task, we achieved a mAP value of 99.165%.

artificial intelligence, machine learning, natural language, (17 more...)

doi: 10.1109/ISM.2020.00034

2010.04676

Country: South America > Brazil (0.28)

Genre: Instructional Material (0.93)

Industry:

Education > Educational Technology > Audio & Video (1.00)
Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.93)
(4 more...)

Schwarzentruber, François

Lecture Notes on Verifying Graph Neural Networks

In these lecture notes, we first recall the connection between graph neural networks, Weisfeiler-Lehman tests and logics such as first-order logic and graded modal logic. We then present a modal logic in which counting modalities appear in linear inequalities in order to solve verification tasks on graph neural networks. We describe an algorithm for the satisfiability problem of that logic. It is inspired from the tableau method of vanilla modal logic, extended with reasoning in quantifier-free fragment Boolean algebra with Presburger arithmetic.

artificial intelligence, logic & formal reasoning, machine learning, (18 more...)

2510.11617

Country:

Europe (1.00)
North America > United States (0.67)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

IntersectioNDE: Learning Complex Urban Traffic Dynamics based on Interaction Decoupling Strategy

Lin, Enli, Yang, Ziyuan, Lu, Qiujing, Hu, Jianming, Feng, Shuo

Realistic traffic simulation is critical for ensuring the safety and reliability of autonomous vehicles (AVs), especially in complex and diverse urban traffic environments. However, existing data-driven simulators face two key challenges: a limited focus on modeling dense, heterogeneous interactions at urban intersections - which are prevalent, crucial, and practically significant in countries like China, featuring diverse agents including motorized vehicles (MVs), non-motorized vehicles (NMVs), and pedestrians - and the inherent difficulty in robustly learning high-dimensional joint distributions for such high-density scenes, often leading to mode collapse and long-term simulation instability. We introduce City Crossings Dataset (CiCross), a large-scale dataset collected from a real-world urban intersection, uniquely capturing dense, heterogeneous multi-agent interactions, particularly with a substantial proportion of MVs, NMVs and pedestrians. Based on this dataset, we propose IntersectioNDE (Intersection Naturalistic Driving Environment), a data-driven simulator tailored for complex urban intersection scenarios. Its core component is the Interaction Decoupling Strategy (IDS), a training paradigm that learns compositional dynamics from agent subsets, enabling the marginal-to-joint simulation. Integrated into a scene-aware Transformer network with specialized training techniques, IDS significantly enhances simulation robustness and long-term stability for modeling heterogeneous interactions. Experiments on CiCross show that IntersectioNDE outperforms baseline methods in simulation fidelity, stability, and its ability to replicate complex, distribution-level urban traffic dynamics.

artificial intelligence, machine learning, simulation, (15 more...)

2510.11534

Country: Asia > China (0.36)

Genre:

Research Report (0.50)
Instructional Material (0.34)

Industry: Transportation > Ground > Road (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.90)

Extended Triangular Method: A Generalized Algorithm for Contradiction Separation Based Automated Deduction

Xu, Yang, Chen, Shuwei, Liu, Jun, Cao, Feng, He, Xingxing

Automated deduction lies at the core of Artificial Intelligence (AI), underpinning theorem proving, formal verification, and logical reasoning. Despite decades of progress, reconciling deductive completeness with computational efficiency remains an enduring challenge. Traditional reasoning calculi, grounded in binary resolution, restrict inference to pairwise clause interactions and thereby limit deductive synergy among multiple clauses. The Contradiction Separation Extension (CSE) framework, introduced in 2018, proposed a dynamic multi-clause reasoning theory that redefined logical inference as a process of contradiction separation rather than sequential resolution. While that work established the theoretical foundation, its algorithmic realization remained unformalized and unpublished. This work presents the Extended Triangular Method (ETM), a generalized contradiction-construction algorithm that formalizes and extends the internal mechanisms of contradiction separation. The ETM unifies multiple contradiction-building strategies, including the earlier Standard Extension method, within a triangular geometric framework that supports flexible clause interaction and dynamic synergy. ETM serves as the algorithmic core of several high-performance theorem provers, CSE, CSE-E, CSI-E, and CSI-Enig, whose competitive results in standard first-order benchmarks (TPTP problem sets and CASC 2018-2015) empirically validate the effectiveness and generality of the proposed approach. By bridging theoretical abstraction and operational implementation, ETM advances the contradiction separation paradigm into a generalized, scalable, and practically competitive model for automated reasoning, offering new directions for future research in logical inference and theorem proving.

artificial intelligence, deduction, logic & formal reasoning, (16 more...)

2510.10701

Country: Europe > United Kingdom (0.28)

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)