 reflection


Pentagon says US military to be an 'AI-first' fighting force

BBC News

The US military plans to increase its use of artificial intelligence (AI) further after the Pentagon agreed to new and expanded contracts with some of the biggest names in technology. Under eight agreements with Google, OpenAI, Amazon, Microsoft, SpaceX, Oracle, Nvidia and the start-up Reflection, the Pentagon said AI technology would now be used for 'any lawful operational use'. 'These agreements accelerate the transformation [of] the US military as an AI-first fighting force,' the Pentagon said. Conspicuous by its absence is Anthropic, as the company has said it is concerned about how the Pentagon could use its tools in warfare and domestically. The firm is now suing the government over the alleged retaliation it faced after refusing to accept 'any lawful use' language in its own contract.



Grok tells researchers pretending to be delusional 'drive an iron nail through the mirror while reciting Psalm 91 backwards'

The Guardian

Researchers found X's AI assistant Grok 4.1 was 'the model most willing to operationalise a delusion, providing detailed real-world guidance'. Elon Musk's AI chatbot was 'extremely validating' of delusional inputs and often went further, 'elaborating new material', the study finds. Grok 4.1 told researchers pretending to be delusional that there was indeed a doppelganger in their mirror and they should drive an iron nail through the glass while reciting Psalm 91 backwards. Researchers at the City University of New York (Cuny) and King's College London have published a paper on how various chatbots protect, or fail to safeguard, users' mental health. Experts are increasingly warning that psychosis or mania can be fuelled by AI chatbots.


Generative AI improves a wireless vision system that sees through obstructions

Robohub

MIT researchers have spent more than a decade studying techniques that enable robots to find and manipulate hidden objects by "seeing" through obstacles. Their methods utilize surface-penetrating wireless signals that reflect off concealed items. Now, the researchers are leveraging generative artificial intelligence models to overcome a longstanding bottleneck that limited the precision of prior approaches. The result is a new method that produces more accurate shape reconstructions, which could improve a robot's ability to reliably grasp and manipulate objects that are blocked from view. This new technique builds a partial reconstruction of a hidden object from reflected wireless signals and fills in the missing parts of its shape using a specially trained generative AI model.


Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels

arXiv.org Machine Learning

Grokking occurs when a model achieves high training accuracy but generalization to unseen test points happens long after that. This phenomenon was initially observed on a class of algebraic problems, such as learning modular arithmetic (Power et al., 2022). We study grokking on algebraic tasks in a class of feature learning kernels via the Recursive Feature Machine (RFM) algorithm (Radhakrishnan et al., 2024), which iteratively updates feature matrices through the Average Gradient Outer Product (AGOP) of an estimator in order to learn task-relevant features. Our main experimental finding is that generalization occurs only when a certain symmetry in the training set is broken. Furthermore, we empirically show that RFM generalizes by recovering the underlying invariance group action inherent in the data. We find that the learned feature matrices encode specific elements of the invariance group, explaining the dependence of generalization on symmetry.
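The Average Gradient Outer Product mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' RFM implementation: the predictor `toy_predictor`, the finite-difference gradient, and the random inputs are all assumptions chosen for illustration. It only shows the quantity itself, the average over data points of the outer product of the predictor's gradient with itself.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f at point x."""
    grad = np.zeros_like(x, dtype=float)
    for k in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[k] = eps
        grad[k] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

def agop(f, X):
    """Average Gradient Outer Product: mean over rows x of grad f(x) grad f(x)^T."""
    grads = np.stack([numerical_gradient(f, x) for x in X])
    return grads.T @ grads / len(X)

# Hypothetical predictor that depends on coordinates 0 and 1 but ignores coordinate 2.
def toy_predictor(x):
    return x[0] ** 2 + 3.0 * x[1]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))   # illustrative inputs, not data from the paper
G = agop(toy_predictor, X)
print(np.round(G, 3))
```

In this toy case the row and column of `G` for the ignored coordinate are numerically zero, which is the sense in which the AGOP highlights task-relevant directions. In RFM the resulting matrix is fed back into the kernel and the procedure is iterated; that loop is omitted here.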


Reflective Multi-Agent Collaboration based on Large Language Models

Neural Information Processing Systems

Benefiting from the powerful language expression and planning capabilities of Large Language Models (LLMs), LLM-based autonomous agents have achieved promising performance in various downstream tasks. Recently, based on the development of single-agent systems, researchers propose to construct LLM-based multi-agent systems to tackle more complicated tasks. In this paper, we propose a novel framework, named COPPER, to enhance the collaborative capabilities of LLM-based agents with the self-reflection mechanism. To improve the quality of reflections, we propose to fine-tune a shared reflector, which automatically tunes the prompts of actor models using our counterfactual PPO mechanism. On the one hand, we propose counterfactual rewards to assess the contribution of a single agent's reflection within the system, alleviating the credit assignment problem. On the other hand, we propose to train a shared reflector, which enables the reflector to generate personalized reflections according to agent roles, while reducing the computational resource requirements and improving training stability. We conduct experiments on three datasets to evaluate the performance of our model in multi-hop question answering, mathematics, and chess scenarios. Experimental results show that COPPER possesses stronger reflection capabilities and exhibits excellent generalization performance across different actor models.


RobIR: Robust Inverse Rendering for High-Illumination Scenes

Neural Information Processing Systems

Implicit representation has opened up new possibilities for inverse rendering. However, existing implicit neural inverse rendering methods struggle to handle strongly illuminated scenes with significant shadows and slight reflections. The existence of shadows and reflections can lead to an inaccurate understanding of the scene, making precise factorization difficult. To this end, we present RobIR, an implicit inverse rendering approach that uses ACES tone mapping and regularized visibility estimation to reconstruct accurate BRDF of the object. By accurately modeling the indirect radiance field, normal, visibility, and direct light simultaneously, we are able to accurately decouple environment lighting and the object's PBR materials without imposing strict constraints on the scene. Even in high-illumination scenes with shadows and specular reflections, our method can recover high-quality albedo and roughness with no shadow interference. RobIR outperforms existing methods in both quantitative and qualitative evaluations.
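As a rough illustration of the tone-mapping step named in the abstract above, the sketch below applies a widely used rational-curve approximation of the ACES filmic tone map (Narkowicz's fit). Whether RobIR uses this particular approximation or the full ACES transform is an assumption here, and the function name `aces_tonemap` is hypothetical.

```python
import numpy as np

def aces_tonemap(x):
    """Approximate ACES filmic curve (Narkowicz fit), mapping HDR radiance to [0, 1].

    Assumption: this fit stands in for whichever ACES variant the paper uses.
    """
    x = np.asarray(x, dtype=float)
    a, b, c, d, e = 2.51, 0.03, 2.43, 0.59, 0.14
    mapped = (x * (a * x + b)) / (x * (c * x + d) + e)
    return np.clip(mapped, 0.0, 1.0)

# Illustrative radiance values: dark, mid-range, and strongly over-exposed.
radiance = np.array([0.0, 0.18, 1.0, 4.0, 50.0])
print(aces_tonemap(radiance))
```

The curve compresses very bright values smoothly toward 1 instead of clipping them abruptly, which is one plausible reason a tone map helps when fitting strongly illuminated scenes.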


AGILE: A Novel Reinforcement Learning Framework of LLM Agents

Neural Information Processing Systems

We introduce a novel reinforcement learning framework of LLM agents named AGILE (AGent that Interacts and Learns from Environments) designed to perform complex conversational tasks with users, leveraging LLMs, memory, tools, and interactions with experts. The agent possesses capabilities beyond conversation, including reflection, tool usage, and expert consultation. We formulate the construction of such an LLM agent as a reinforcement learning (RL) problem, in which the LLM serves as the policy model. We fine-tune the LLM using labeled data of actions and the PPO algorithm. We focus on question answering and release a dataset for agents called ProductQA, comprising challenging questions in online shopping. Our extensive experiments on ProductQA, MedMCQA and HotPotQA show that AGILE agents based on 7B and 13B LLMs trained with PPO can outperform GPT-4 agents. Our ablation study highlights the indispensability of memory, tools, consultation, reflection, and reinforcement learning in achieving the agent's strong performance.