raft

Discovering the dynamics of \emph{Sargassum} rafts' centers of mass

Beron-Vera, Francisco J., Bonner, Gage

arXiv.org Artificial Intelligence

Since 2011, rafts of floating \emph{Sargassum} seaweed have frequently obstructed the coasts of the Intra-Americas Seas. The motion of the rafts is represented by a high-dimensional nonlinear dynamical system, referred to as the eBOMB model, which builds on the Maxey--Riley equation by incorporating interactions between the clumps of \emph{Sargassum} forming a raft and the effects of Earth's rotation. The absence of a predictive law for the rafts' centers of mass suggests a need for machine learning. In this paper, we evaluate and contrast Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) and Sparse Identification of Nonlinear Dynamics (SINDy). In both cases, a physics-inspired closure modeling approach rooted in eBOMB is taken. Specifically, the LSTM model learns a mapping from a collection of eBOMB variables to the difference between the raft's center-of-mass velocity and the ocean velocity. The SINDy model's library of candidate functions is suggested by eBOMB variables and includes windowed velocity terms that incorporate far-field effects of the carrying flow. Both the LSTM and SINDy models perform most effectively when clumps are tightly bonded, with precision declining as complexity rises, for example when wind effects are included or when loosely connected clumps are assessed. The LSTM model delivered its best results with straightforward designs, using fewer neurons and hidden layers. While the LSTM model is an opaque black box lacking interpretability, the SINDy model brings transparency by discerning explicit functional relationships through its function libraries. Integrating the windowed velocity terms enabled effective modeling of nonlocal interactions, particularly in datasets featuring sparsely connected rafts.
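The SINDy step the abstract refers to can be illustrated with a minimal sketch of sequentially thresholded least squares: regress measured time derivatives onto a library of candidate functions and iteratively prune small coefficients. The library terms below (constant, x, x²) and the toy data are placeholders for illustration only, not the eBOMB-derived terms used in the paper.

```python
import numpy as np

def sindy_fit(X, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares over a candidate library."""
    # Candidate library: [1, x, x^2] evaluated column-wise.
    Theta = np.hstack([np.ones((X.shape[0], 1)), X, X**2])
    Xi, *_ = np.linalg.lstsq(Theta, dXdt, rcond=None)
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold           # prune small coefficients
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):           # refit on surviving terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], dXdt[:, k],
                                                 rcond=None)
    return Theta, Xi

# Toy usage: recover dx/dt = -2x from lightly noised samples.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1))
dx = -2.0 * x + 1e-3 * rng.normal(size=x.shape)
_, Xi = sindy_fit(x, dx)
```

The sparsity-promoting thresholding is what yields an interpretable, explicit model: only the library terms that survive the pruning appear in the recovered dynamics.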


ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning

Wang, Ziwen, Fan, Jiajun, Guo, Ruihan, Nguyen, Thao, Ji, Heng, Liu, Ge

arXiv.org Artificial Intelligence

Protein generative models have shown remarkable promise in protein design but still face limitations in success rate, due to the scarcity of high-quality protein datasets for supervised pretraining. We present ProteinZero, a novel framework that enables scalable, automated, and continuous self-improvement of the inverse folding model through online reinforcement learning. To achieve computationally tractable online feedback, we introduce efficient proxy reward models based on ESMFold and a novel rapid ddG predictor that significantly accelerates evaluation. ProteinZero employs a general RL framework balancing multi-reward maximization, KL-divergence from a reference model, and a novel protein-embedding-level diversity regularization that prevents mode collapse while promoting higher sequence diversity. Through extensive experiments, we demonstrate that ProteinZero substantially outperforms existing methods across every key metric in protein design, achieving significant improvements in structural accuracy, designability, thermodynamic stability, and sequence diversity. Most impressively, ProteinZero reduces design failure rates by approximately 36% - 48% compared to widely-used methods like ProteinMPNN, ESM-IF and InstructPLM, consistently achieving success rates exceeding 90% across diverse and complex protein folds. Notably, the entire RL run on CATH-4.3 can be done on a single 8-GPU node in under 3 days, including reward computation. Our work establishes a new paradigm for protein design where models evolve continuously from their own generated outputs, opening new possibilities for exploring the vast protein design space.
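The three-part objective the abstract describes can be sketched as a single scalar combining a reward term, a KL penalty against the reference model, and an embedding-level diversity bonus. This is a hypothetical illustration of the structure only; the function name, the pairwise-distance diversity measure, and the coefficients `beta` and `lam` are assumptions, not the paper's implementation.

```python
import numpy as np

def rl_objective(rewards, logp_policy, logp_ref, embeddings, beta=0.1, lam=0.05):
    """Objective = mean multi-reward - beta * KL(policy || ref) + lam * diversity."""
    # rewards: (n, r) matrix of proxy rewards per sample (e.g., fold accuracy, ddG).
    reward_term = rewards.mean(axis=1).mean()
    # Monte Carlo KL estimate from log-prob differences on the sampled sequences.
    kl_term = (logp_policy - logp_ref).mean()
    # Embedding-level diversity: mean pairwise distance between sample embeddings,
    # which penalizes mode collapse (identical samples give zero diversity).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    diversity = np.sqrt((diffs ** 2).sum(-1)).mean()
    return reward_term - beta * kl_term + lam * diversity
```

The KL term anchors the policy to the pretrained reference model while the diversity term pushes samples apart in embedding space, which is the balance the framework is built around.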


Improving LLM-Powered EDA Assistants with RAFT

Shi, Luyao, Kazda, Michael, Schmitter, Charles, Gupta, Hemlata

arXiv.org Artificial Intelligence

Electronic design engineers often struggle to efficiently access relevant information for tasks like design verification and technology development. While large language models (LLMs) can enhance productivity as conversational agents, pre-trained open-source LLMs lack domain-specific knowledge for Electronic Design Automation (EDA). In a Retrieval-Augmented Generation (RAG) context, LLMs rely on external context but may still produce inaccurate responses. Retrieval-Augmented Fine-Tuning (RAFT) improves LLM performance, but acquiring labeled question/answer (Q/A) data in EDA is difficult. To address this, we propose using synthetic Q/A datasets to enhance LLMs with RAFT. Our results show that RAFT with synthetic data significantly boosts LLM performance for RAG-based EDA tasks. We also investigate the impact of using real user questions as Retrieval-Augmented Few-Shot (RAFS) examples for synthetic data generation. Additionally, we implement secure access control to ensure sensitive information is only accessible to authorized personnel. Finally, we assess the risk of data leakage and unintended memorization during fine-tuning with synthetic data, providing practical insights.


Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability

Zhu, Chiwei, Xu, Benfeng, Yang, An, Lin, Junyang, Wang, Quan, Zhou, Chang, Mao, Zhendong

arXiv.org Artificial Intelligence

Training language models with rationales augmentation has been shown to be beneficial in many existing works. In this paper, we identify that this prevailing view does not hold consistently. We conduct comprehensive investigations to thoroughly inspect the impact of rationales on model performance as well as on a novel perspective of model reliability. The results lead to several key findings that add new insights to existing understanding: 1) Rationales can, at times, deteriorate model performance; 2) Rationales can, at times, improve model reliability, even outperforming their untrained counterparts; 3) A linear correspondence exists between the performance and reliability improvements, and both are driven by the intrinsic difficulty of the task. These findings provide practical guidance for the broad utilization of rationales and raise critical implications for the procedure of explicitly aligning language models with implicit human thoughts. Code is available at https://github.com/Ignoramus0817/rationales.


A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Xiong, Wei, Yao, Jiarui, Xu, Yuhui, Pang, Bo, Wang, Lei, Sahoo, Doyen, Li, Junnan, Jiang, Nan, Zhang, Tong, Xiong, Caiming, Dong, Hanze

arXiv.org Machine Learning

We investigate reinforcement learning (RL) algorithms in the context of fine-tuning large language models (LLMs) with verifiable rewards. Our focus is on mathematical reasoning tasks, which have recently received significant attention following the release of models such as OpenAI's O1 Model (Jaech et al., 2024) and DeepSeek-R1 (DeepSeek-AI et al., 2025). The dominant approach in LLM post-training has been Proximal Policy Optimization (PPO) (Schulman et al., 2017; Bai et al., 2022; Ouyang et al., 2022). However, PPO requires an additional critic network beyond the vanilla Reinforce algorithm (Williams and Peng, 1991), introducing both computational overhead and algorithmic complexity. Meanwhile, the deterministic transition dynamics of LLM generation simplify the problem and yield relatively low variance, so many of PPO's sophisticated components may be unnecessary in this setting. This observation has inspired growing interest in designing simpler yet effective RL algorithms for post-training LLMs. Several recent works revisit Reinforce-style approaches, including ReMax (Li et al., 2023), RLOO (Ahmadian et al., 2024; Kool et al., 2019), GRPO (Shao et al., 2024), and Reinforce++ (Hu, 2025). In parallel, other methods explore different directions beyond policy gradients. Reward-ranked fine-tuning (RAFT) (Anthony et al., 2017; Dong et al., 2023) iteratively generates n responses per prompt, filters out those with incorrect answers, and fine-tunes the LLM on the remaining accepted samples.
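The RAFT loop described above — sample n responses per prompt, keep only those whose answer verifies as correct, fine-tune on the survivors — can be sketched in a few lines. `generate` and `is_correct` below are placeholder stand-ins for a real LLM sampler and answer verifier, not any library's API.

```python
import random

def raft_round(prompts, generate, is_correct, n=8):
    """One rejection-sampling round: keep only verified-correct responses."""
    accepted = []
    for prompt in prompts:
        for response in generate(prompt, n):   # n samples per prompt
            if is_correct(prompt, response):   # verifiable-reward filter
                accepted.append((prompt, response))
    return accepted  # fine-tune the LLM on these (prompt, response) pairs

# Toy usage with a fake "model" that sometimes answers 2+2 correctly.
random.seed(0)
gen = lambda p, n: [str(random.choice([3, 4, 5])) for _ in range(n)]
check = lambda p, r: r == "4"
data = raft_round(["2+2?"], gen, check)
```

Because the filter discards all incorrect samples, the fine-tuning set contains only verified responses; this is the minimalist alternative to critic-based policy-gradient methods that the paper's title alludes to.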


Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG

Bhushan, Kushagra, Nandwani, Yatin, Khandelwal, Dinesh, Gupta, Sonam, Pandey, Gaurav, Raghu, Dinesh, Joshi, Sachindra

arXiv.org Artificial Intelligence

Retrieval-Augmented Generation (RAG) has emerged as a prominent method for incorporating domain knowledge into Large Language Models (LLMs). While RAG enhances response relevance by incorporating retrieved domain knowledge in the context, retrieval errors can still lead to hallucinations and incorrect answers. To recover from retriever failures, domain knowledge is injected by fine-tuning the model to generate the correct response, even in the case of retrieval errors. However, we observe that without systematic knowledge augmentation, fine-tuned LLMs may memorize new information but still fail to extract relevant domain knowledge, leading to poor performance. In this work, we present a novel framework that significantly enhances the fine-tuning process by augmenting the training data in two ways -- context augmentation and knowledge paraphrasing. In context augmentation, we create multiple training samples for a given QA pair by varying the relevance of the retrieved information, teaching the model when to ignore and when to rely on retrieved content. In knowledge paraphrasing, we fine-tune with multiple answers to the same question, enabling LLMs to better internalize specialized knowledge. To mitigate catastrophic forgetting due to fine-tuning, we add a domain-specific identifier to a question and also utilize a replay buffer containing general QA pairs. Experimental results demonstrate the efficacy of our method over existing techniques, achieving up to 10\% relative gain in token-level recall while preserving the LLM's generalization capabilities.
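The context-augmentation idea described above can be sketched as follows: from one QA pair and its gold passage, build several training samples that vary how relevant the retrieved context is, so the model learns both when to rely on the context and when to fall back on injected knowledge. The function name, sample schema, and three-way split are assumptions for illustration, not the paper's exact recipe.

```python
def augment_context(question, answer, gold_passage, distractors, k=2):
    """Build training samples with varying retrieval relevance for one QA pair."""
    samples = []
    # 1) Fully relevant retrieval: the gold passage is present.
    samples.append({"question": question,
                    "context": [gold_passage],
                    "answer": answer})
    # 2) Mixed retrieval: the gold passage buried among distractors.
    samples.append({"question": question,
                    "context": distractors[:k] + [gold_passage],
                    "answer": answer})
    # 3) Simulated retrieval failure: distractors only; the model must answer
    #    from fine-tuned domain knowledge rather than the context.
    samples.append({"question": question,
                    "context": distractors[:k],
                    "answer": answer})
    return samples
```

Knowledge paraphrasing would then add further samples pairing the same question with paraphrased answers; combined with a replay buffer of general QA pairs, this is the augmented fine-tuning set the framework trains on.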


Swarms of tiny robots coordinate to achieve ant-like feats of strength

New Scientist

Swarms of tiny robots guided by magnetic fields can coordinate to act like ants, from packing together to form a floating raft to lifting objects hundreds of times their weight. About the size of a grain of sand, the microrobots could someday do jobs larger bots cannot, such as unblocking blood vessels and delivering drugs to specific locations inside the human body. Jeong Jae Wie at Hanyang University in South Korea and his colleagues made the tiny, cube-shaped robots using a mould and epoxy resin embedded with magnetic alloy. These small magnetic particles enable the microrobots to be "programmed" to form various configurations after being exposed to strong magnetic fields from certain angles. The bots can then be controlled by external magnetic fields to perform spins or other motions.