Agents of Change: Self-Evolving LLM Agents for Strategic Planning
Belle, Nikolas, Barnes, Dakota, Amayuelas, Alfonso, Bercovich, Ivan, Wang, Xin Eric, Wang, William
We address the long-horizon gap in large language model (LLM) agents by enabling them to sustain coherent strategies in adversarial, stochastic environments. Settlers of Catan provides a challenging benchmark: success depends on balancing short- and long-term goals amid randomness, trading, expansion, and blocking. Prompt-centric LLM agents (e.g., ReAct, Reflexion) must re-interpret large, evolving game states each turn, quickly saturating context windows and losing strategic consistency. We propose HexMachina, a continual learning multi-agent system that separates environment discovery (inducing an adapter layer without documentation) from strategy improvement (evolving a compiled player through code refinement and simulation). This design preserves executable artifacts, allowing the LLM to focus on high-level strategy rather than per-turn reasoning. In controlled Catanatron experiments, HexMachina learns from scratch and evolves players that outperform the strongest human-crafted baseline (AlphaBeta), achieving a 54% win rate and surpassing prompt-driven and no-discovery baselines. Ablations confirm that isolating pure strategy learning improves performance. Overall, artifact-centric continual learning transforms LLMs from brittle stepwise deciders into stable strategy designers, advancing long-horizon autonomy.
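The evolve-and-select loop the abstract describes (refine a compiled player artifact, simulate, keep what wins) can be sketched minimally as below; all names here are illustrative assumptions, not HexMachina's actual API:

```python
def evolve_player(initial_player, mutate, simulate, generations=10, rollouts=50):
    """Hypothetical sketch of artifact-centric strategy evolution:
    keep a compiled player artifact, propose a code-level refinement,
    and accept it only if the simulated win rate improves."""
    best_player = initial_player
    best_rate = simulate(initial_player, rollouts)   # win rate vs. fixed baselines
    for _ in range(generations):
        candidate = mutate(best_player)              # e.g., an LLM-proposed code edit
        rate = simulate(candidate, rollouts)
        if rate > best_rate:                         # greedy selection of artifacts
            best_player, best_rate = candidate, rate
    return best_player, best_rate
```

Because the selected artifact is executable code rather than a prompt, the LLM's per-turn context never grows with the game state: it only intervenes between generations, at the `mutate` step.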
Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions
Su, Junhao, Wan, Yuanliang, Yang, Junwei, Shi, Hengyu, Han, Tianyang, Luo, Junfeng, Qiu, Yurui
Tool-augmented large language models (LLMs) are usually trained with supervised imitation or coarse-grained reinforcement learning that optimizes single tool calls. Current self-reflection practices rely on heuristic prompts or one-way reasoning: the model is urged to 'think more' instead of learning error diagnosis and repair. This is fragile in multi-turn interactions; after a failure the model often repeats the same mistake. We propose structured reflection, which turns the path from error to repair into an explicit, controllable, and trainable action. The agent produces a short yet precise reflection: it diagnoses the failure using evidence from the previous step and then proposes a correct, executable follow-up call. For training we combine DAPO and GSPO objectives with a reward scheme tailored to tool use, optimizing the stepwise strategy Reflect, then Call, then Final. To evaluate, we introduce Tool-Reflection-Bench, a lightweight benchmark that programmatically checks structural validity, executability, parameter correctness, and result consistency. Tasks are built as mini trajectories of erroneous call, reflection, and corrected call, with disjoint train and test splits. Experiments on BFCL v3 and Tool-Reflection-Bench show large gains in multi-turn tool-call success and error recovery, and a reduction of redundant calls. These results indicate that making reflection explicit and optimizing it directly improves the reliability of tool interaction and offers a reproducible path for agents to learn from failure.
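The stepwise Reflect, then Call, then Final strategy can be sketched as a small control loop in which the reflection is an explicit, checkable record rather than free-form "thinking"; the names and schema below are illustrative assumptions, not the paper's:

```python
from dataclasses import dataclass

@dataclass
class Reflection:
    """Explicit, structured reflection record (illustrative schema)."""
    failed_call: str      # the tool call that errored
    diagnosis: str        # evidence-based explanation of the failure
    repaired_call: str    # a concrete, executable follow-up call

def reflect_then_call(execute, propose_call, propose_repair, max_turns=3):
    """Sketch of the Reflect -> Call -> Final loop:
    on failure, produce a structured reflection and retry with its repair."""
    call = propose_call()
    for _ in range(max_turns):
        ok, result = execute(call)
        if ok:
            return result                          # Final: successful tool result
        reflection = propose_repair(call, result)  # Reflect: diagnose from the error
        call = reflection.repaired_call            # Call: corrected follow-up
    return None
```

Because each `Reflection` is a structured object, its fields can be programmatically checked for validity and executability, which is exactly the kind of supervision a benchmark like Tool-Reflection-Bench can score.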
Finite-Sample Maximum Likelihood Estimation of Location
We consider 1-dimensional location estimation, where we estimate a parameter \lambda from n samples \lambda + \eta_i, with each \eta_i drawn i.i.d. from a known distribution f. For fixed f the maximum-likelihood estimate (MLE) is well-known to be optimal in the limit as n \to \infty: it is asymptotically normal with variance matching the Cramer-Rao lower bound of \frac{1}{n\mathcal{I}}, where \mathcal{I} is the Fisher information of f. However, this bound does not hold for finite n, or when f varies with n. We show for arbitrary f and n that one can recover a similar theory based on the Fisher information of a smoothed version of f, where the smoothing radius decays with n.
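The smoothing construction can be sketched as follows; the notation is mine and the bound is stated only up to constants, not as the paper's precise theorem:

```latex
% Convolve the noise density f with a Gaussian of radius r:
\[
  f_r(x) = \int f(x - t)\,\frac{1}{\sqrt{2\pi r^2}}\, e^{-t^2/(2r^2)}\,dt
\]
% A finite-sample analogue of the Cramer-Rao bound then holds with the
% Fisher information of the smoothed density, for a radius r_n decaying in n:
\[
  \operatorname{Var}(\hat{\lambda}) \approx \frac{1}{n\,\mathcal{I}(f_{r_n})}
\]
```

Intuitively, smoothing regularizes densities whose raw Fisher information is large or infinite (e.g., densities with sharp discontinuities) so that the resulting bound is meaningful at finite n.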
EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States
Xu, Hainiu, Qi, Siya, Li, Jiazheng, Zhou, Yuxiang, Du, Jinhua, Catmur, Caroline, He, Yulan
Theory-of-Mind (ToM), the ability to infer others' perceptions and mental states, is fundamental to human interaction but remains a challenging task for Large Language Models (LLMs). While existing ToM reasoning methods show promise with reasoning via perceptual perspective-taking, they often rely excessively on LLMs, reducing their efficiency and limiting their applicability to high-order ToM reasoning, which requires multi-hop reasoning about characters' beliefs. To address these issues, we present EnigmaToM, a novel neuro-symbolic framework that enhances ToM reasoning by integrating a Neural Knowledge Base of entity states (Enigma) for (1) a psychology-inspired iterative masking mechanism that facilitates accurate perspective-taking and (2) knowledge injection that elicits key entity information. Enigma generates structured representations of entity states, which construct spatial scene graphs -- leveraging spatial information as an inductive bias -- for belief tracking of various ToM orders and enhancing events with fine-grained entity state details. Experimental results on multiple benchmarks, including ToMi, HiToM, and FANToM, show that EnigmaToM significantly improves ToM reasoning across LLMs of varying sizes, particularly excelling in high-order reasoning scenarios.
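The iterative masking idea (a character's beliefs are tracked only from events they could perceive, given their location in the spatial scene) can be sketched minimally as below; the event format and function names are my illustrative assumptions, not EnigmaToM's representation:

```python
def masked_view(events, observer, scene):
    """Sketch of perceptual masking over a spatial scene:
    a character only 'sees' events in the room they currently occupy,
    so belief tracking operates on this filtered event stream."""
    location = scene[observer]                       # observer's starting room
    visible = []
    for actor, action, room in events:               # event: (actor, action, room)
        if actor == observer and action.startswith("move:"):
            location = action.split(":", 1)[1]       # observer changes rooms
        if room == location:                         # mask events elsewhere
            visible.append((actor, action, room))
    return visible
```

Run on a classic Sally-Anne setup, only the pre-departure event survives the mask, which is precisely the false belief a first-order ToM question probes; higher-order questions apply the same masking recursively, one observer per hop.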
Echoes of Biases: How Stigmatizing Language Affects AI Performance
Liu, Yizhi, Wang, Weiguang, Gao, Guodong Gordon, Agarwal, Ritu
Electronic health records (EHRs) serve as an essential data source for the envisioned artificial intelligence (AI)-driven transformation in healthcare. However, clinician biases reflected in EHR notes can lead to AI models inheriting and amplifying these biases, perpetuating health disparities. This study investigates the impact of stigmatizing language (SL) in EHR notes on mortality prediction using a Transformer-based deep learning model and explainable AI (XAI) techniques. Our findings demonstrate that SL written by clinicians adversely affects AI performance, particularly so for black patients, highlighting SL as a source of racial disparity in AI model development. To explore an operationally efficient way to mitigate SL's impact, we investigate patterns in the generation of SL through a clinicians' collaborative network, identifying central clinicians as having a stronger impact on racial disparity in the AI model. We find that removing SL written by central clinicians is a more efficient bias reduction strategy than eliminating all SL in the entire corpus of data. This study provides actionable insights for responsible AI development and contributes to understanding clinician behavior and EHR note writing in healthcare.
Analysis of ChatGPT on Source Code
Sadik, Ahmed R., Ceravola, Antonello, Joublin, Frank, Patra, Jibesh
This paper explores the use of Large Language Models (LLMs) and in particular ChatGPT in programming, source code analysis, and code generation. LLMs and ChatGPT are built using machine learning and artificial intelligence techniques, and they offer several benefits to developers and programmers. While these models can save time and provide highly accurate results, they are not yet advanced enough to replace human programmers entirely. The paper investigates the potential applications of LLMs and ChatGPT in various areas, such as code creation, code documentation, bug detection, refactoring, and more. The paper also suggests that the usage of LLMs and ChatGPT is expected to increase in the future as they offer unparalleled benefits to the programming community.
Timekettle to showcase HybridComm Translation Technology
As the pioneer of translator earbuds, Timekettle has been transforming cross-language communication since its founding in 2016. HybridComm is what makes Timekettle competitive on the market. It breaks free of the constraints of conventional translation products by adopting an entirely different technical architecture to advance speech processing, simultaneous interpretation, and AI translation. "I believe the significance of a great translation product lies beyond the technological advancement of one company over another; it is the whole interaction experience between people of different cultural and language backgrounds that makes a Timekettle product great." That is why Timekettle has invested heavily in developing the world's first simultaneous-translation earbuds, the WT2 Edge, built on its core HybridComm technology, enabling cross-language communication that is completely hands-free, natural, and fluent.
35 Insurtech Companies Making Coverage Simpler
Regardless of where you live or who you are, protecting your home, assets and loved ones will always be a key concern. In a world full of uncertainty, people from all backgrounds want to be sure they're safeguarded from threats, potential disasters and loss of property. Enrolling in home, auto, health or other types of insurance can bring peace of mind, but as anyone who has tried to navigate the buying exchanges will know, this can often be easier said than done. The barriers to enrollment are many, and with so many complicated coverage options, convoluted eligibility requirements and fine print to sort through, the insurance industry is in need of a makeover. Thankfully, the insurtech industry has arisen to do just that.
Artificial Intelligence in Cars: Examples of AI in the Auto Industry
Artificial intelligence and self-driving cars are often complementary topics in technology. Simply put, you cannot really discuss one without the other. Though AI is being implemented at rapid speed in a variety of sectors, the way it's being used in the automotive industry is a hot-button issue. With every car manufacturer and their mother racing to develop artificial intelligence and self-driving technologies, there are also a slew of tech companies and startups with the same purpose. Though many believe personal, autonomous vehicles are the future, there are multiple ways in which AI and machine learning are being implemented in how vehicles are built and how they operate on the road.