new information
The Mirror Loop: Recursive Non-Convergence in Generative Reasoning Systems
Large language models are often described as capable of reflective reasoning, yet recursive self-evaluation without external feedback frequently yields reformulation rather than progress. We test this prediction in a cross-provider study of 144 reasoning sequences across three models (OpenAI GPT-4o-mini, Anthropic Claude 3 Haiku, and Google Gemini 2.0 Flash) and four task families (arithmetic, code, explanation, reflection), each iterated ten times under two conditions: ungrounded self-critique and a minimal grounding intervention (a single verification step at iteration three). Mean informational change (delta I, measured via normalized edit distance) declined by 55% from early (0.193) to late (0.087) iterations in ungrounded runs, with consistent patterns across all three providers. Grounded runs showed a +28% rebound in informational change immediately after the intervention and sustained non-zero variance thereafter. Complementary measures (n-gram novelty, embedding drift, and character-level entropy) converged on the same pattern: reflection without contact tends toward informational closure. We interpret this as evidence for a structural limit on self-correction in generative reasoning: without an exchange of information with an independent verifier or environment, recursive inference approaches an attractor state of epistemic stasis. Minimal grounding functions as dissipative coupling, reintroducing informational flux. The cross-architecture consistency suggests the mirror loop arises from shared autoregressive training objectives rather than provider-specific alignment schemes. The results delineate when reflection is performative rather than epistemic and motivate design principles for grounded, cooperative reasoning. Materials and code are publicly available.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
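The Mirror Loop abstract above measures informational change (delta I) between successive iterations via normalized edit distance. A minimal sketch of one way to compute such a measure follows. It assumes character-level Levenshtein distance normalized by the longer string's length; the abstract does not state the exact normalization, and the function names here are hypothetical, not taken from the authors' code.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute), row-by-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else levenshtein(a, b) / longest


def delta_i(outputs: list[str]) -> list[float]:
    """Informational change between each pair of consecutive iterations."""
    return [normalized_edit_distance(prev, curr)
            for prev, curr in zip(outputs, outputs[1:])]


if __name__ == "__main__":
    # Hypothetical iteration outputs; later rewrites change less and less.
    run = ["The answer is 12.", "The answer is 14.", "The answer is 14."]
    print(delta_i(run))
```

Under this definition, a run whose later iterations merely restate earlier ones produces values near zero, which is the closure pattern the abstract describes.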
Motion Planning Under Temporal Logic Specifications In Semantically Unknown Environments
Taheri, Azizollah, Aksaray, Derya
This paper addresses a motion planning problem to achieve spatio-temporal-logical tasks, expressed by syntactically co-safe linear temporal logic specifications (scLTL\next), in uncertain environments. Here, the uncertainty is modeled as probabilistic knowledge of the semantic labels of the environment. For example, the task is "first go to region 1, then go to region 2"; however, the exact locations of regions 1 and 2 are not known a priori, and only a probabilistic belief is available. We propose a novel automata-theoretic approach, where a special product automaton is constructed to capture the uncertainty related to semantic labels, and a reward function is designed for each edge of this product automaton. The proposed algorithm utilizes value iteration for online replanning. We present theoretical results along with simulations and experiments to demonstrate the efficacy of the proposed approach.
- Asia > Middle East > Republic of Türkiye > Aksaray Province > Aksaray (0.05)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
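The motion planning abstract above combines a product automaton, edge rewards, and value iteration. The sketch below illustrates that general idea on a toy problem and is not the authors' construction. Its assumptions: a five-cell corridor, a three-state automaton for "first region 1, then region 2", hypothetical label beliefs, an expected reward of 1 on edges entering the accepting state, and beliefs treated as independent per-visit transition probabilities (a real planner would update beliefs and replan online).

```python
from itertools import product

GAMMA = 0.9
ACCEPTING = 2  # automaton states: 0 = nothing seen, 1 = region 1 seen, 2 = done

# Hypothetical beliefs: probability that each corridor cell carries the label.
P_REGION1 = [0.0, 0.8, 0.0, 0.1, 0.0]
P_REGION2 = [0.0, 0.0, 0.1, 0.0, 0.9]
CELLS = range(len(P_REGION1))


def neighbors(cell):
    """Cells reachable in one step (move left or right along the corridor)."""
    return [c for c in (cell - 1, cell + 1) if 0 <= c < len(P_REGION1)]


def advance_prob(q, cell):
    """Probability that entering `cell` advances the automaton from state q."""
    if q == 0:
        return P_REGION1[cell]
    if q == 1:
        return P_REGION2[cell]
    return 0.0


def q_value(values, q, nxt):
    """Expected edge reward plus discounted value of the product successor."""
    p = advance_prob(q, nxt)
    reward = p if q + 1 == ACCEPTING else 0.0
    return reward + GAMMA * (p * values[(nxt, q + 1)] + (1 - p) * values[(nxt, q)])


def value_iteration(tol=1e-9):
    """In-place value iteration over product states (cell, automaton state)."""
    values = {(c, q): 0.0 for c, q in product(CELLS, (0, 1, 2))}
    while True:
        delta = 0.0
        for cell, q in values:
            if q == ACCEPTING:
                continue  # accepting states are absorbing with value 0
            best = max(q_value(values, q, nxt) for nxt in neighbors(cell))
            delta = max(delta, abs(best - values[(cell, q)]))
            values[(cell, q)] = best
        if delta < tol:
            return values


def best_move(values, cell, q):
    """Greedy next cell under the converged values."""
    return max(neighbors(cell), key=lambda nxt: q_value(values, q, nxt))
```

Calling `best_move(value_iteration(), cell, q)` after each observation is one simple way to mimic replanning in this toy setting: when a belief changes, the tables are edited and the values recomputed.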
Are Large Reasoning Models Interruptible?
Wu, Tsung-Han, Miroyan, Mihran, Chan, David M., Darrell, Trevor, Norouzi, Narges, Gonzalez, Joseph E.
Large Reasoning Models (LRMs) excel at complex reasoning but are traditionally evaluated in static, "frozen world" settings: model responses are assumed to be instantaneous, and the context of a request is presumed to be immutable over the duration of the response. While generally true for short-term tasks, the "frozen world" assumption breaks down in modern reasoning tasks such as assistive programming, where models may take hours to think through problems and code may change dramatically from the time the model starts thinking to the model's final output. In this work, we challenge the frozen world assumption and evaluate LRM robustness under two realistic dynamic scenarios: interruptions, which test the quality of the model's partial outputs on a limited budget, and dynamic context, which tests model adaptation to in-flight changes. Across mathematics and programming benchmarks that require long-form reasoning, static evaluations consistently overestimate robustness: even state-of-the-art LRMs, which achieve high accuracy in static settings, can fail unpredictably when interrupted or exposed to changing context, with performance dropping by up to 60% when updates are introduced late in the reasoning process. Our analysis further reveals several novel failure modes, including reasoning leakage, where models fold the reasoning into their final answer when interrupted; panic, where under time pressure models abandon reasoning entirely and return incorrect answers; and self-doubt, where performance degrades while incorporating updated information. Project Page: http://dynamic-lm.github.io/
GRU-ODE and GRU-Bayes have complementary
We thank reviewers for the relevant comments. We first address general questions and then give brief individual answers. [...] Those projected distributions vary smoothly as they are driven by an ODE. [...] Continuous-time Bayesian networks (Nodelman et al., UAI 2002) address a [...] This joint modeling of continuous measurements and events was left for future work. [...] Some assumptions have to be made about the conditional distribution of the observations.
Assessing Large Language Models in Updating Their Forecasts with New Information
Yuan, Zhangdie, Ding, Zifeng, Vlachos, Andreas
Prior work has largely treated future event prediction as a static task, failing to consider how forecasts and the confidence in them should evolve as new evidence emerges. To address this gap, we introduce EVOLVECAST, a framework for evaluating whether large language models appropriately revise their predictions in response to new information. In particular, EVOLVECAST assesses whether LLMs adjust their forecasts when presented with information released after their training cutoff. We use human forecasters as a comparative reference to analyze prediction shifts and confidence calibration under updated contexts. While LLMs demonstrate some responsiveness to new information, their updates are often inconsistent or overly conservative. We further find that neither verbalized nor logits-based confidence estimates consistently outperform the other, and both remain far from the human reference standard. Across settings, models tend to express conservative bias, underscoring the need for more robust approaches to belief updating.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
Capturing Opinion Shifts in Deliberative Discourse through Frequency-based Quantum deep learning methods
Thakur, Rakesh, Chaturvedi, Harsh, Shah, Ruqayya, Chauhan, Janvi, Sharma, Ayush
Deliberation plays a crucial role in shaping outcomes by weighing diverse perspectives before reaching decisions. With recent advancements in Natural Language Processing, it has become possible to computationally model deliberation by analyzing opinion shifts and predicting potential outcomes under varying scenarios. In this study, we present a comparative analysis of multiple NLP techniques to evaluate how effectively models interpret deliberative discourse and produce meaningful insights. Opinions from individuals of varied backgrounds were collected to construct a self-sourced dataset that reflects diverse viewpoints. Deliberation was simulated using product presentations enriched with striking facts, which often prompted measurable shifts in audience opinions. We compare two models, Frequency-Based Discourse Modulation and the Quantum-Deliberation Framework, both of which outperform existing state-of-the-art models. Deliberation is the structured process of reasoning, dialogue, and weighing evidence before decisions are made. Unlike ordinary conversation, it emphasizes logical argumentation, inclusivity, and critical reflection.
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.49)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue
Kim, Sangyeop, Lee, Yohan, Kim, Sanghwa, Kim, Hyunjong, Cho, Sungzoon
Effective long-term memory in conversational AI requires synthesizing information across multiple sessions. However, current systems place an excessive reasoning burden on response generation, making performance heavily dependent on model size. We introduce PREMem (Pre-storage Reasoning for Episodic Memory), a novel approach that shifts complex reasoning processes from inference to memory construction. PREMem extracts fine-grained memory fragments categorized into factual, experiential, and subjective information; it then establishes explicit relationships between memory items across sessions, capturing evolution patterns like extensions, transformations, and implications. By performing this reasoning during pre-storage rather than when generating a response, PREMem creates enriched representations while reducing computational demands during interactions. Experiments show significant performance improvements across all model sizes, with smaller models achieving results comparable to much larger baselines while maintaining effectiveness even with constrained token budgets. Code and dataset are available at https://github.com/sangyeop-kim/PREMem.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Middle East > Jordan (0.05)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Research Report > New Finding (0.67)
- Research Report > Promising Solution (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.84)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Consensus in Motion: A Case of Dynamic Rationality of Sequential Learning in Probability Aggregation
Gordienko, Polina, Jansen, Christoph, Augustin, Thomas, Rechenauer, Martin
We propose a framework for probability aggregation based on propositional probability logic. Unlike conventional judgment aggregation, which focuses on static rationality, our model addresses dynamic rationality by ensuring that collective beliefs update consistently with new information. We show that any consensus-compatible and independent aggregation rule on a non-nested agenda is necessarily linear. Furthermore, we provide sufficient conditions for a fair learning process, where individuals initially agree on a specified subset of propositions known as the common ground, and new information is restricted to this shared foundation. This guarantees that updating individual judgments via Bayesian conditioning, whether performed before or after aggregation, yields the same collective belief. A distinctive feature of our framework is its treatment of sequential decision-making, which allows new information to be incorporated progressively through multiple stages while maintaining the established common ground. We illustrate our findings with a running example in a political scenario concerning healthcare and immigration policies.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Germany > Saxony > Leipzig (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
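The probability aggregation abstract above states that, when individuals agree on the common ground and new information is restricted to it, Bayesian conditioning gives the same collective belief whether applied before or after linear aggregation. The sketch below checks this numerically on a toy case. It assumes beliefs are represented as distributions over four hypothetical worlds rather than in the paper's propositional framework; the agents, weights, and numbers are illustrative only.

```python
from fractions import Fraction as F

WORLDS = ("w1", "w2", "w3", "w4")
EVIDENCE = {"w1", "w2"}  # the new information, assumed to lie in the common ground

# Alice and Bob both assign probability 1/2 to EVIDENCE; Carol assigns 4/5.
alice = {"w1": F(1, 10), "w2": F(4, 10), "w3": F(3, 10), "w4": F(2, 10)}
bob = {"w1": F(3, 10), "w2": F(2, 10), "w3": F(1, 10), "w4": F(4, 10)}
carol = {"w1": F(6, 10), "w2": F(2, 10), "w3": F(1, 10), "w4": F(1, 10)}


def linear_pool(beliefs, weights):
    """Weighted average of individual distributions (weights sum to 1)."""
    return {w: sum(wt * b[w] for b, wt in zip(beliefs, weights)) for w in WORLDS}


def condition(belief, event):
    """Bayesian conditioning on an event with positive probability."""
    total = sum(belief[w] for w in event)
    return {w: (belief[w] / total if w in event else F(0)) for w in WORLDS}


def commutes(beliefs, weights, event):
    """True if pooling then conditioning equals conditioning then pooling."""
    pool_then_condition = condition(linear_pool(beliefs, weights), event)
    condition_then_pool = linear_pool([condition(b, event) for b in beliefs], weights)
    return pool_then_condition == condition_then_pool


if __name__ == "__main__":
    print(commutes([alice, bob], [F(1, 4), F(3, 4)], EVIDENCE))    # agreement on evidence
    print(commutes([alice, carol], [F(1, 2), F(1, 2)], EVIDENCE))  # disagreement on evidence
```

The reason is visible in the arithmetic: if every individual assigns the evidence the same probability c, the pooled probability of the evidence is also c, so dividing by c before or after averaging gives the same result. With Carol, the denominators differ and the two orders diverge.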
The Roots of International Perceptions: Simulating US Attitude Changes Towards China with LLM Agents
Sukiennik, Nicholas, Xu, Yichuan, Kan, Yuqing, Piao, Jinghua, Yan, Yuwei, Gao, Chen, Li, Yong
The rise of LLMs opens new possibilities in modeling opinion evolution, a long-standing task in simulation, by leveraging advanced reasoning abilities to recreate complex, large-scale human cognitive trends. While most prior works focus on opinion evolution surrounding specific isolated events or the views within a country, ours is the first to model the large-scale attitude evolution of a population representing an entire country towards another: US citizens' perspectives towards China. To tackle the challenges of this broad scenario, we propose a framework that integrates media data collection, user profile creation, and a cognitive architecture for opinion updates to reproduce the real trend of US attitudes towards China over a 20-year period from 2005 to today. We also leverage LLMs' capabilities to introduce de-biased media exposure, extracting neutral events from typically subjective news content, to uncover the roots of polarized opinion formation, as well as a devil's advocate agent to help explain the rare reversal from negative to positive attitudes towards China, corresponding with changes in the way Americans obtain information about the country. The simulation results, beyond validating our framework architecture, also reveal the impact of biased framing and selection bias in shaping attitudes. Overall, our work contributes to a new paradigm for LLM-based modeling of cognitive behaviors in a large-scale, long-term, cross-border social context, providing insights into the formation of international biases and offering valuable implications for media consumers to better understand the factors shaping their perspectives, and ultimately contributing to the larger social need for bias reduction and cross-cultural tolerance.
- North America > United States (0.46)
- South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
- Europe > United Kingdom (0.04)
- Media > News (1.00)
- Government (1.00)
- Banking & Finance (0.93)