MARINE: Theoretical Optimization and Design for Multi-Agent Recursive IN-context Enhancement

Zhang, Hongwei, Lu, Ji, Du, Yongsheng, Gao, Yanqin, Huang, Lingjun, Wang, Baoli, Tan, Fang, Zou, Peng

arXiv.org Artificial Intelligence

Large Language Model (LLM)-based agents demonstrate advanced reasoning capabilities, yet practical constraints frequently limit outputs to single responses, leaving significant performance potential unrealized. This paper introduces MARINE (Multi-Agent Recursive IN-context Enhancement), a theoretically grounded framework that reconceptualizes test-time reasoning as iterative refinement of a persistent reference trajectory, fundamentally departing from conventional one-shot or multi-sample paradigms. The MARINE refinement operator systematically converts a base model's pass@N capabilities into near-optimal pass@1 performance. Rigorous theoretical analysis establishes that minimal feasible batches maximize expected performance gains under fixed invocation budgets, while logarithmically growing batch schedules ensure continuous improvement without computational constraints. Comprehensive evaluation on the BrowserComp-ZH benchmark demonstrates state-of-the-art results, with a 685B-parameter implementation achieving 46.0% pass@1 accuracy. Meanwhile, MARINE establishes a new paradigm for parameter-efficient reasoning: an 80B-parameter model augmented with MARINE matches the performance of standalone 1000B-parameter agents, reducing parameter requirements by over an order of magnitude. Notably, within a fixed computational budget, the proposed MARINE delivers higher-quality samples to alignment and optimization processes than traditional sampling-and-ranking strategies. Consequently, it has great potential to boost post-training efficiency.
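The refinement loop the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `generate` and `score` are hypothetical interfaces standing in for the base model and a verifier, and the logarithmic batch schedule is one plausible reading of "logarithmically growing batch schedules".

```python
import math

def marine_refine(problem, generate, score, total_budget, rounds):
    """Sketch of recursive in-context refinement (hypothetical API).

    generate(problem, reference, k) -> list of k candidate trajectories,
    conditioned on the persistent reference trajectory.
    score(trajectory) -> float, higher is better.
    """
    reference = None   # persistent reference trajectory
    spent = 0          # model invocations used so far
    for t in range(1, rounds + 1):
        # Logarithmically growing batch schedule (unbounded-budget regime);
        # under a fixed budget the paper's analysis favors the minimal
        # feasible batch instead.
        k = max(1, math.ceil(math.log(t + 1)))
        if spent + k > total_budget:
            break
        candidates = generate(problem, reference, k)
        spent += k
        best = max(candidates, key=score)
        # Refine the reference only when a candidate improves on it.
        if reference is None or score(best) > score(reference):
            reference = best
    return reference
```

The point of the loop is that each round re-conditions the model on the best trajectory found so far, converting pass@N-style diversity into a single improving pass@1 answer.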



Implementation Details

Neural Information Processing Systems

Stalkers and Zealots, and their attacks cover a small area instead of a single unit. See SMAC [39] for further environment details. All hidden layers have 64 units. See our released source code for additional training details. The pseudocode for LICA is summarized in Algorithm 1.


EvoCurr: Self-evolving Curriculum with Behavior Code Generation for Complex Decision-making

Cheng, Yang, Wang, Zilai, Ma, Weiyu, Zhu, Wenhui, Deng, Yue, Zhao, Jian

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse domains, including programming, planning, and decision-making. However, their performance often degrades when faced with highly complex problem instances that require deep reasoning over long horizons. In such cases, direct problem-solving approaches can lead to inefficiency or failure due to the lack of structured intermediate guidance. To address this, we propose a novel self-evolving framework, EvoCurr, in which a dedicated curriculum-generation LLM constructs a sequence of problem instances with gradually increasing difficulty, tailored to the solver LLM's learning progress. The curriculum dynamically adapts, easing challenges when the solver struggles and escalating them when success is consistent, thus maintaining an optimal learning trajectory. This approach enables the solver LLM, implemented as a code-generation model producing Python decision-tree scripts, to progressively acquire the skills needed for complex decision-making tasks. Experimental results on challenging decision-making benchmarks show that our method significantly improves task success rates and solution efficiency compared to direct-solving baselines. These findings suggest that LLM-driven curriculum learning holds strong potential for enhancing automated reasoning in real-world, high-complexity domains.
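The adaptive difficulty loop can be sketched as below. This is a hypothetical sketch, not the paper's algorithm: `solve` stands in for running the solver LLM's generated decision-tree script on an instance, and the streak thresholds are illustrative choices.

```python
def evolve_curriculum(solve, max_steps=50, target_difficulty=10):
    """Sketch of an adaptive curriculum (hypothetical interface).

    solve(difficulty) -> bool: whether the solver's generated script
    passes an instance of the given difficulty.
    """
    difficulty = 1
    streak = 0  # consecutive successes at the current difficulty
    for _ in range(max_steps):
        if solve(difficulty):
            streak += 1
            if streak >= 2:                       # escalate when success is consistent
                difficulty += 1
                streak = 0
        else:
            difficulty = max(1, difficulty - 1)   # ease when the solver struggles
            streak = 0
        if difficulty > target_difficulty:
            return True  # curriculum has carried the solver past the target
    return False
```

The key design choice is that difficulty moves in both directions, so the curriculum tracks the solver's actual competence rather than a fixed schedule.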



AVA: Attentive VLM Agent for Mastering StarCraft II

Ma, Weiyu, Fu, Yuqian, Zhang, Zecheng, Ghanem, Bernard, Li, Guohao

arXiv.org Artificial Intelligence

We introduce Attentive VLM Agent (AVA), a multimodal StarCraft II agent that aligns artificial agent perception with the human gameplay experience. Traditional frameworks such as SMAC rely on abstract state representations that diverge significantly from human perception, limiting the ecological validity of agent behavior. Our agent addresses this limitation by incorporating RGB visual inputs and natural language observations that more closely simulate human cognitive processes during gameplay. The AVA architecture consists of three integrated components: (1) a vision-language model enhanced with specialized self-attention mechanisms for strategic unit targeting and battlefield assessment, (2) a retrieval-augmented generation system that leverages domain-specific StarCraft II knowledge to inform tactical decisions, and (3) a dynamic role-based task distribution system that enables coordinated multi-agent behavior. The experimental evaluation in our proposed AVACraft environment, which contains 21 multimodal StarCraft II scenarios, demonstrates that AVA powered by foundation models (specifically Qwen-VL and GPT-4o) can execute complex tactical maneuvers without explicit training, achieving comparable performance to traditional MARL methods that require substantial training iterations. This work establishes a foundation for developing human-aligned StarCraft II agents and advances the broader research agenda of multimodal game AI. Our implementation is available at https://github.com/camel-ai/VLM-Play-StarCraft2.
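The three-component architecture above can be summarized as a single decision step. Everything here is a toy sketch with hypothetical interfaces (`vlm`, `retriever`, `assign_roles`), not the authors' API; the role allocation is deliberately trivial for illustration.

```python
def assign_roles(agents, assessment):
    # Toy role allocation: alternate attacker / scout (illustrative only).
    roles = ["attacker", "scout"]
    return {a: roles[i % len(roles)] for i, a in enumerate(agents)}

def ava_step(frame, text_obs, vlm, retriever, agents):
    """One decision step through the three components described above."""
    # (1) Vision-language model: assess the battlefield from RGB + text.
    assessment = vlm(frame, text_obs)
    # (2) Retrieval-augmented generation: pull relevant StarCraft II knowledge.
    tactics = retriever(assessment)
    # (3) Role-based task distribution: each agent gets a role-conditioned order.
    return {agent: (role, tactics)
            for agent, role in assign_roles(agents, assessment).items()}
```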


Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance

Zhao, Linxi, Deng, Yihe, Zhang, Weitong, Gu, Quanquan

arXiv.org Artificial Intelligence

The advancement of Large Vision-Language Models (LVLMs) has increasingly highlighted the critical issue of their tendency to hallucinate non-existing objects in the images. To address this issue, previous works focused on using specially curated datasets or powerful LLMs (e.g., GPT-3.5) to rectify the outputs of LVLMs. However, these approaches require either expensive training/fine-tuning or API access to advanced LLMs to correct the model's output post-generation. In this paper, we tackle this challenge by introducing a framework called Mitigating hallucinAtion via classifieR-Free guIdaNcE (MARINE), which is both training-free and API-free, and can effectively and efficiently reduce object hallucinations during the generation process. Specifically, MARINE enriches the visual context of LVLMs by integrating existing open-source vision models, and employs classifier-free guidance to incorporate the additional object grounding features to improve the precision of LVLMs' generations. Through comprehensive evaluations across $6$ popular LVLMs with diverse evaluation metrics, we demonstrate the effectiveness of MARINE, which even outperforms existing fine-tuning-based methods. Remarkably, it not only reduces hallucinations but also improves the detailedness of LVLMs' generations, as assessed by GPT-4V.
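Classifier-free guidance at decoding time has a simple generic form, sketched below. This is the standard CFG interpolation, not necessarily the paper's exact parameterization: the guided logits push the next-token distribution toward the prediction conditioned on the extra object-grounding features.

```python
def cfg_logits(cond_logits, uncond_logits, gamma=1.5):
    """Generic classifier-free guidance over per-token logits.

    cond_logits:   logits conditioned on the grounding-enriched visual context
    uncond_logits: logits from the unguided model
    gamma:         guidance strength (gamma=1 recovers the conditioned logits)
    """
    return [u + gamma * (c - u) for c, u in zip(cond_logits, uncond_logits)]
```

With `gamma > 1`, tokens favored by the grounded context are amplified relative to the unguided prediction, which is the mechanism by which hallucinated objects get suppressed.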


Attention-Guided Contrastive Role Representations for Multi-Agent Reinforcement Learning

Hu, Zican, Zhang, Zongzhang, Li, Huaxiong, Chen, Chunlin, Ding, Hongyu, Wang, Zhi

arXiv.org Artificial Intelligence

Cooperative multi-agent reinforcement learning (MARL) aims to coordinate a system of agents towards optimizing global returns (Vinyals et al., 2019), and has shown significant promise in various domains, such as autonomous vehicles (Zhou et al., 2020), smart grids (Chen et al., 2021a), robotics (Yu et al., 2023), and social science (Leibo et al., 2017). Training reliable control policies for coordinating such systems remains a major challenge. Centralized training with decentralized execution (CTDE) (Foerster et al., 2016) combines the merits of independent Q-learning (Foerster et al., 2017) and joint action learning (Sukhbaatar et al., 2016), and has become a compelling paradigm that exploits the centralized training opportunity to train fully decentralized policies (Wang et al., 2023). Subsequently, numerous popular algorithms have been proposed, including VDN (Sunehag et al., 2018), QMIX (Rashid et al., 2020), MAAC (Iqbal & Sha, 2019), and MAPPO (Yu et al., 2022). Sharing policy parameters is crucial for scaling these algorithms to massive numbers of agents with accelerated cooperation learning (Fu et al., 2022). However, it is widely observed that agents tend to acquire homogeneous behaviors, which might hinder diversified exploration and sophisticated coordination (Christianos et al., 2021). Some methods (Li et al., 2021; Jiang & Lu, 2021; Liu et al., 2023) attempt to promote individualized behaviors by distinguishing each agent from the others, but they often neglect the prospect of effective team composition with implicit task allocation. Real-world multi-agent tasks usually involve dynamic team composition with the emergence of roles (Shao et al., 2022; Hu et al., 2022).
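The value-decomposition idea behind VDN, cited above, is compact enough to sketch: the joint action value is the sum of per-agent utilities, so each agent can act greedily on its own utility while training remains centralized on the joint value. This is the standard VDN formulation, shown here in a minimal form.

```python
def vdn_qtot(per_agent_q, actions):
    """VDN-style value decomposition: Q_tot is the sum of each agent's
    utility for its chosen action. Because the sum is monotonic in every
    per-agent utility, decentralized greedy action selection is consistent
    with maximizing the joint value.

    per_agent_q: list of per-agent action-value vectors
    actions:     list of chosen action indices, one per agent
    """
    return sum(q[a] for q, a in zip(per_agent_q, actions))
```

QMIX generalizes this by replacing the sum with a learned monotonic mixing network, which is why both satisfy the same decentralized-greedy consistency property.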


New breed of military AI robo-dogs could be the Marines' secret weapon

FOX News

The U.S. Marine Corps is testing a new breed of robotic canine that can do much more than fetch and could possibly be headed to the battlefield. The Marines hope that these four-legged robotic dogs will enhance the mobility and safety of their troops in the future. The Unitree Go1 robot dog, nicknamed the GOAT (Grounded Open-Air Transport) by the Marines, is a four-legged machine with a built-in AI system. It can be outfitted to carry an infantry anti-armor rocket launcher on its back, and can also be equipped with a forward-facing GoPro camera, multiple rails for extra cameras, aiming lasers, and other essential gear.


Is the US Navy using AI to prepare for the next conflict?

FOX News

Jets can be flown by AI and can even take off, land, and participate in dogfights. It's no secret at this point that AI is rapidly taking over many industries, and it certainly has its positives and negatives. Some are concerned with how using this technology will impact jobs for humans, while others are thrilled to see how tasks will get done much more efficiently. One field that is using AI to its fullest capabilities is the U.S. Navy. Our military's defense mechanisms have improved enormously in the 21st century; however, they have never used technology quite like this.